SEQUENTIAL RANKING PROCEDURES

By 
Elias Alphonse Parent, Jr.

1. Introduction. Many statistical procedures used and studied today
are sequential in nature. By this we mean that the time when a
statistical decision is reached is random. In contrast to such
procedures are the fixed sample size procedures. Best known perhaps is
sequential analysis and the sequential probability ratio test as
formulated by Wald [6]. There are other sequential procedures, for
example in process inspection schemes, where, based on a sequence of
observations, a decision is made to stop the process and take some
adjusting action, the time at which the process is stopped being a
random variable. There are many other sequential-like procedures.

In the theory of hypothesis testing for the case of a simple
hypothesis against a simple alternative it is known that a most powerful
test can be determined by the Neyman-Pearson lemma, which is of the form:

\[
\text{reject } f = f_0 \quad \text{if} \quad
\Lambda_n = \frac{f_1(X_1, X_2, \ldots, X_n)}{f_0(X_1, X_2, \ldots, X_n)} \ge K
\]

where the hypotheses to be tested are $f = f_0$ against $f = f_1$, and
$f_0$ and $f_1$ are the joint densities of the observations
$X_1, X_2, \ldots, X_n$ corresponding to each hypothesis. This is an
example of a nonsequential procedure. To extend such a procedure to the
sequential idea we need only modify the test as follows:

take a sample of size $m$ and

reject $f_0$ if $\Lambda_m \ge K_1$,

accept $f_0$ if $\Lambda_m \le K_2$,

draw another sample of size $n - m$ if $K_2 < \Lambda_m < K_1$;

if the second sample is required compute $\Lambda_n$ and

reject $f_0$ if $\Lambda_n \ge K$,

accept $f_0$ if $\Lambda_n < K$.

Such a simple modification gives us a two stage procedure with a new
feature in that the total sample size is random, being either $m$ or
$n$, depending upon the outcome of the first stage. This basic idea of a
sequential test was proposed by Dodge and Romig in [8], and has been
extended to multiple stage sampling plans.

Sequential hypothesis testing as proposed by Wald requires that a
computation of $\Lambda_n$ and a decision be made as each observation is
taken. Briefly, to test $f = f_0$ against $f = f_1$, select constants
$B < A$ and compute $\Lambda_n$ as each observation is taken, and proceed
according to the rule

if $\Lambda_n \ge A$ reject $f = f_0$,

if $\Lambda_n \le B$ reject $f = f_1$,

if $B < \Lambda_n < A$ take another observation and compute $\Lambda_{n+1}$.
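The rule above can be sketched in code. This is only an illustrative
sketch, not part of the original text: it assumes independent
observations, so that $\log \Lambda_n$ is a running sum of per-observation
log ratios, and the names `sprt` and `log_ratio` are hypothetical.

```python
import math

def sprt(observations, log_ratio, A, B):
    """Apply Wald's stopping rule with thresholds B < A.

    log_ratio(x) is assumed to return log(f1(x) / f0(x)) for one
    observation; independence is assumed so the logs add up.
    Returns (decision, number of observations used).
    """
    log_a, log_b = math.log(A), math.log(B)
    log_lambda = 0.0
    n = 0
    for n, x in enumerate(observations, start=1):
        log_lambda += log_ratio(x)
        if log_lambda >= log_a:   # Lambda_n >= A
            return "reject f0", n
        if log_lambda <= log_b:   # Lambda_n <= B
            return "reject f1", n
    # Data ran out before either boundary was reached.
    return "undecided", n
```

For unit-variance normal densities with means 0 and 1 (an assumed
example), the per-observation log ratio is `x - 0.5`, so
`sprt(data, lambda x: x - 0.5, A=19.0, B=1/19.0)` applies the rule to
`data`.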

Since the sequential probability ratio test is formulated in terms
of the ratio which leads to most powerful tests according to the
Neyman-Pearson theory we would expect it to have good properties. This
indeed is the case in that of all tests with the same power the
sequential probability ratio test requires on the average the fewest
observations. This optimal property was conjectured by Wald and finally
proved by Wald and Wolfowitz in [9].


In order to carry out these sequential tests of hypotheses we note
that an assumption as to the specific form of $f_0$ and $f_1$ must be
made. It often happens that the form of the underlying distribution is
not assumed known and in this case nonparametric statistical methods are
used. In nonparametric statistics many tests of statistical hypotheses
are based on the set of ranks $(T_1, T_2, \ldots, T_n)$ determined from a
random sample $(X_1, X_2, \ldots, X_n)$, or the signs of the observations
($\pm 1$ according as $X_i$ is positive or negative), or on a combination
of both of these sets of statistics derived from the basic observations.
The sign test, signed rank test, Wilcoxon-Mann-Whitney test,
Fisher-Yates test and many others are examples of such fixed sample size
nonparametric tests.

Contrary to the case in parametric statistics, there are very few
sequential procedures in nonparametric statistics, particularly
sequential procedures based on signs, ranks, or both. One reason for
this is that for most specified alternatives to the null hypothesis it
is difficult to compute probabilities for statistics based on signs and
ranks, which in turn makes it difficult to properly evaluate the
properties and operating characteristics of the procedures. This
difficulty can be circumvented by restricting attention to special
classes of alternatives such as those proposed by Lehmann in [1], where
to the null hypothesis $F(x)$ he proposed alternatives of the form
$F^a(x)$, $a > 0$. This of course does not solve the basic problem of
alternatives, as the question arises of whether or not the Lehmann
alternative is appropriate for the problem being considered. However it
is a first step inasmuch as it does allow us to develop some sequential
procedures where exact distribution theory calculations are possible. In
the fixed sample size problem it simplifies considerations of power of
rank tests.

An example of a nonparametric sequential test is the following
adaptation of Wald's sequential probability ratio test for binomial
observations. Consider a sequence of independent identically distributed
random variables $X_1, X_2, \ldots$ with cumulative distribution function
$F(t) = P(X_1 \le t)$. We wish to test $F(t_0) = p_0$ against
$F(t_0) = p_1$ for some fixed value $t_0$. The number of observations
less than or equal to $t_0$, say $N$, after taking $n$ observations, is a
binomial random variable with parameters $F(t_0)$ and $n$. The
probability ratio reduces to

\[
\frac{P(N \mid F(t_0) = p_1)}{P(N \mid F(t_0) = p_0)}
= \left( \frac{p_1}{p_0} \cdot \frac{1 - p_0}{1 - p_1} \right)^{N}
  \left( \frac{1 - p_1}{1 - p_0} \right)^{n}
\]

and the sequential test based on this ratio is discussed in Wald [6].
For the special case where $t_0 = 0$, $N$ is equivalent to the number of
negative observations after $n$ trials and this would be a sequential
test based on the signs of the observations.
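As an illustration, the logarithm of the binomial probability ratio can
be computed directly from the count $N$. The sketch below is not from
the original text; the function names are hypothetical, and $0 < p_0,
p_1 < 1$ is assumed so that the logarithms exist.

```python
import math

def count_at_or_below(xs, t0):
    """N: number of observations less than or equal to t0."""
    return sum(1 for x in xs if x <= t0)

def binomial_log_ratio(N, n, p0, p1):
    """log of P(N | p1) / P(N | p0) after n observations.

    The binomial coefficients cancel, leaving
    N * log(p1 (1 - p0) / (p0 (1 - p1))) + n * log((1 - p1) / (1 - p0)).
    """
    log_odds = math.log(p1 * (1 - p0) / (p0 * (1 - p1)))
    return N * log_odds + n * math.log((1 - p1) / (1 - p0))
```

With `t0 = 0` the count is the number of nonpositive observations, so
feeding `binomial_log_ratio` to Wald's rule gives a sequential test
based on signs.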

An example of a nonparametric sequential procedure based on ranks of
observations is the grouped rank test developed by Wilcoxon, Rhodes and
Bradley [4]. Actually two sequential procedures are developed in [4],
the Configural Rank Test and the Rank Sum Test. Basically, observations
are taken in groups of $m$ X's and $n$ Y's and the observations are
ranked within each group. For each group a statistic is computed based
on the ranks and Wald's sequential probability ratio test is applied to
the sequence of statistics so generated. Each group of $m$ X's and $n$
Y's becomes the basic unit used in the probability ratio. Suppose the
X-population has distribution $F(x)$ and the Y-population has
distribution $G(y)$, and observations are taken as follows

\[
\begin{array}{ll}
(X_{11}, X_{12}, \ldots, X_{1m}, Y_{11}, Y_{12}, \ldots, Y_{1n}) & \text{group } 1 \\
(X_{21}, X_{22}, \ldots, X_{2m}, Y_{21}, Y_{22}, \ldots, Y_{2n}) & \text{group } 2 \\
\qquad \vdots & \\
(X_{\gamma 1}, X_{\gamma 2}, \ldots, X_{\gamma m}, Y_{\gamma 1}, Y_{\gamma 2}, \ldots, Y_{\gamma n}) & \text{group } \gamma
\end{array}
\]


Let $R_\gamma = (R_{\gamma 1}, R_{\gamma 2}, \ldots, R_{\gamma m},
S_{\gamma 1}, S_{\gamma 2}, \ldots, S_{\gamma n})$ be the rank vector
associated with group $\gamma$, where $R_{\gamma i}$ is the rank of
$X_{\gamma i}$ and $S_{\gamma i}$ is the rank of $Y_{\gamma i}$, the ranks
taken from the combined ranking of the X's and Y's. Taking a function of
$R_\gamma$, say $T_\gamma = T(R_\gamma)$, we generate a new sequence of
random variables $T_1, T_2, \ldots$ and the Wald sequential probability
ratio test may now be applied to the $T_\gamma$. For independent group to
group sampling we have

\[
\Lambda_n = \prod_{\gamma = 1}^{n}
\frac{P(T_\gamma \mid Y \sim G(y))}{P(T_\gamma \mid Y \sim F(y))}
\tag{1.2}
\]


as the probability ratio to test the hypothesis that the Y-population
has distribution $F(y)$ against $G(y)$. In [4] the authors consider
Lehmann alternatives $G(y) = F^k(y)$, $k > 0$, and the function $T$ in
one case is the actual configuration of X's and Y's, which is equivalent
to the vector $(S_{\gamma 1}, S_{\gamma 2}, \ldots, S_{\gamma n})$, and in
the second case $T$ is taken to be the sum of the Y ranks.

Wilcoxon, Rhodes and Bradley observe that the test could be improved
by taking observations in pairs and reranking from the beginning each
time a new observation pair is taken. One reason for the reduced
efficiency of the group ranking method is that the observations in one
group are not compared with observations from any other group. The
reranking suggestion would take into account all comparisons. However,
this is very cumbersome, and moreover reranking introduces
nonindependence of successive probability ratios, making an analysis of
the properties of such a procedure difficult.

Thus in order to attack the problem of nonparametric sequential
tests of hypotheses based on ranks we should consider procedures such
that the distribution theory is tractable and such that ranks are
assigned in a truly sequential manner, avoiding as much as possible the
complexities introduced by reranking. To this end two new sequential
ranking methods will be defined in this dissertation.

In order to be led somewhat naturally to these new ranking methods
we now consider the reranking procedure in more detail. Let $T_{ij}$ be
the rank of $X_j$ at the $i$th stage in the reranking process. We observe
$X_1, X_2, \ldots, X_n, \ldots$ and each time a new observation is taken
the entire set of observations is reranked. We have

\[
\begin{array}{ll}
\text{Observation vectors} & \text{Rank vectors} \\
(X_1, X_2) & (T_{21}, T_{22}) \\
(X_1, X_2, X_3) & (T_{31}, T_{32}, T_{33}) \\
\qquad \vdots & \qquad \vdots \\
(X_1, X_2, \ldots, X_n) & (T_{n1}, T_{n2}, \ldots, T_{nn})
\end{array}
\]

Notice that the vector $(T_{11}, T_{22}, \ldots, T_{nn})$ completely
determines the $n$ rank vectors listed above in the sense that each
vector could be reconstructed given only $T_{ii}$, $i = 1, 2, \ldots, n$.
$T_{ii}$ is the rank of $X_i$ relative to the set
$\{X_1, X_2, \ldots, X_i\}$. Thus we can rank an observation as it is
observed, relative to the preceding observations, without reranking the
previous observations and still retain the information contained in the
$n$ rank vectors which would come from reranking. This method of ranking
observations is one way of assigning ranks which fits in naturally with
the idea of sequential procedures and lends itself to developing
sequential procedures in nonparametric problems. This ranking procedure
also takes into account all comparisons among the observations.
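The reconstruction claim can be made concrete with a short sketch. It
is illustrative only: the function names are hypothetical and distinct
observations (no ties) are assumed. When a new observation receives
sequential rank $z$, every earlier rank that is at least $z$ moves up by
one, which rebuilds each reranked vector from the $T_{ii}$ alone.

```python
def sequential_ranks(xs):
    """T_ii: rank of xs[i] among xs[0..i] (1-based), assuming no ties."""
    return [sum(1 for y in xs[: i + 1] if y <= xs[i]) for i in range(len(xs))]

def ordinary_ranks(xs):
    """Full reranking of xs (1-based), assuming no ties."""
    return [sum(1 for y in xs if y <= x) for x in xs]

def rank_vectors_from_sequential(zs):
    """Rebuild the stage-by-stage reranked vectors from sequential ranks."""
    ranks = []
    stages = []
    for z in zs:
        # Earlier observations ranked at or above z are pushed up by one.
        ranks = [r + 1 if r >= z else r for r in ranks] + [z]
        stages.append(list(ranks))
    return stages
```

For the observations `[2.5, 0.3, 1.7, 4.0]` the sequential ranks are
`[1, 1, 2, 4]`, and the last rebuilt vector is `[3, 1, 2, 4]`, the same
as a full reranking of the four values.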

Analogous to the fixed sample size signed rank test we will define a
second sequential ranking procedure based upon the absolute values of
the observations and taking into account the signs of the observations.
This signed sequential ranking procedure will be applied to a problem in
process control. By process control we mean a procedure where the aim is
to determine when a given sequence of random variables changes from
being distributed according to a distribution $F(x)$ to a different
distribution $G(x)$. The term process control enjoys a broader definition
today, including those cases where the process is adjusted according to
some statistic based upon the sequence of observations. Such procedures
are referred to as adaptive control methods.

The early methods used to control a process were based on control
charts (Shewhart charts) and modifications of these control charts. To
control the mean value of some dimension of a process at a particular
value $\mu_0$, samples of size $n$ are taken at frequent intervals of
time and the sample mean $\bar{X}$ is compared with
$\mu_0 \pm k\sigma/\sqrt{n}$. If $\bar{X}$ falls outside these lines the
process is stopped and adjustments to the process are carried out, and
for $\mu_0 - k\sigma/\sqrt{n} \le \bar{X} \le \mu_0 + k\sigma/\sqrt{n}$
the process is allowed to continue without adjustment. Modifications to
the basic control chart method came in the form of "warning lines"
inside the action lines $\mu_0 \pm k\sigma/\sqrt{n}$. Further
modifications were introduced which changed the action rule to rules of
the type "if K consecutive points on the chart fall outside control
lines, take action." These early procedures failed to take advantage of
all the information contained in the sequence $X_1, X_2, \ldots, X_n$. At
best the modified action rules used only the information contained in a
fixed number of sample values in the immediate past.

In order to take advantage of this unused information the stopping
rule should incorporate the entire sample. A step in this direction was
taken by Page in [7] with the introduction of cumulative sum schemes. If
the mean of a process is to be controlled the cumulative sums
$S_n = \sum_{i=1}^{n} (X_i - k)$ are plotted on a chart against $n$. The
entire history of the process is presented and changes in the process
mean are visible through changes in direction of the mean path. To
detect one-sided deviations in the mean, say increases, the stopping
rule used is to stop the process when the current point of the path
$(n, S_n)$ rises a given amount $h > 0$ above the previous lowest point of
the path. Two-sided deviations are treated by applying two one-sided
schemes simultaneously. For normal observations the cumulative sum
schemes have been found to be more sensitive than the Shewhart control
chart.
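A minimal sketch of the one-sided stopping rule just described follows.
Two details are assumptions not fixed by the text: the starting point
$S_0 = 0$ is counted among the earlier points of the path, and the
process is stopped when the rise is at least $h$ rather than strictly
greater. The function name is hypothetical.

```python
def cusum_stop(xs, k, h):
    """First n at which S_n is at least h above the lowest earlier point.

    S_n = sum_{i <= n} (x_i - k); returns None if the rule never fires.
    """
    s = 0.0
    lowest = 0.0  # assumed: S_0 = 0 is part of the path
    for n, x in enumerate(xs, start=1):
        s += x - k
        if s - lowest >= h:
            return n
        lowest = min(lowest, s)
    return None
```

A scheme for decreases can be obtained by applying the same function to
the negated observations with a suitable reference value, and running
both gives the two-sided version mentioned above.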
When no assumption is made as to the form of the underlying
distributions we might look to nonparametric methods for a control
procedure. For example, the sequential rank of $X_i$ is equally likely to
be $1, 2, \ldots, i$ as long as no change takes place in the distribution
of $X_1, X_2, \ldots, X_i$. But when a location change takes place, say
an increase in the process mean, larger ranks would be more probable. We
will consider the sequential rank of $|X_i|$ relative to
$|X_1|, |X_2|, \ldots, |X_i|$, multiplied by the sign of $X_i$ ($+1$ if
$X_i > 0$ and $-1$ if $X_i \le 0$), in a process control problem. This
method of sequentially assigning ranks, as noted before, will be called
signed sequential ranking.
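The signed sequential rank can be computed in one pass over the data.
The sketch below is illustrative; the function name is hypothetical,
and it assumes no ties among the absolute values.

```python
def signed_sequential_ranks(xs):
    """Sequential rank of |x_i| among |x_1|, ..., |x_i|, times sign(x_i)."""
    out = []
    for i, x in enumerate(xs):
        rank = sum(1 for y in xs[: i + 1] if abs(y) <= abs(x))
        out.append(rank if x > 0 else -rank)
    return out
```

For example, the observations `[-1.2, 0.5, 3.0, -2.0]` receive the
signed sequential ranks `[-1, 1, 3, -3]`.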

This dissertation defines two methods of assigning ranks in a
sequential manner to observations $X_1, X_2, \ldots$. Basic properties of
the sequential ranks are studied and distribution theory is determined.
Section 2 contains some preliminary results including some relating to
order statistics of observations taken from nonidentical distributions.
These results are used in the later sections. In Section 3 the method of
sequential ranking is defined and it is shown that for a fixed sample
size, ordinary ranks and sequential ranks are equivalent for the purpose
of hypothesis testing. Section 4 is an application to sequential
hypothesis testing for the two sample problem where the alternative is
of the form proposed by Lehmann in [1]. The signed sequential ranking
scheme is defined in Section 5 and a condition on the distribution of
the sequence of observations is given which implies that the signed
sequential ranks are independent. Distribution theory is given for the
signed sequential ranks. Section 6 contains an application of signed
sequential ranking to a process control problem.

2. Preliminary results. Let $X_1, X_2, \ldots, X_n$ be any random
variables with continuous cumulative distribution functions
$F_1, F_2, \ldots, F_n$. Define $X_{nk}$ to be the $k$th smallest in the
set $\{X_1, X_2, \ldots, X_n\}$. We can obtain a general expression for
the distribution of $X_{nk}$ as follows:

\[
F_{nk}(x) = P(X_{nk} \le x)
= \sum_{i=k}^{n} P(i \text{ X's are} \le x \text{ and } n - i \text{ X's are} > x).
\tag{2.1}
\]


Letting $E_i$ denote the event [$i$ X's are $\le x$ and $n - i$ X's are
$> x$], there are $\binom{n}{i}$ ways to select the X's which are less
than or equal to $x$, and a typical way in which $E_i$ could occur is

\[
E_{ij} = [X_{j_1} \le x, \ldots, X_{j_i} \le x,\; x < X_{j_{i+1}}, \ldots, x < X_{j_n}]
\]

where $j = 1, 2, \ldots, \binom{n}{i}$ to take into account all possible
cases. For $j \ne j'$ the events $E_{ij}$ and $E_{ij'}$ are disjoint and
$E_i = \bigcup_j E_{ij}$. Thus we have

\[
F_{nk}(x) = \sum_{i=k}^{n} P(E_i)
= \sum_{i=k}^{n} \sum_{j=1}^{\binom{n}{i}} P(E_{ij})
\]

and further, when the $X_i$ are assumed to be independent we obtain

\[
P(E_{ij}) = \prod_{m=1}^{i} P(X_{j_m} \le x)
\prod_{m=i+1}^{n} \bigl(1 - P(X_{j_m} \le x)\bigr).
\]


As a special case of (2.1), to be used later, we have the following
result when the X's are distributed according to only two different
distributions.

Lemma 2.1. Let $X_1, X_2, \ldots, X_N$ be independent random variables
where $\{X_i, 1 \le i \le m\}$ are distributed according to $F(x)$ and
$\{X_i, m + 1 \le i \le N\}$ are distributed according to $G(x)$. Then

\[
F_{Nk}(x) = \sum_{i=k}^{N} \sum_{j=0}^{i} \binom{m}{j} \binom{N-m}{i-j}
F^{j}(x) (1 - F(x))^{m-j} G^{i-j}(x) (1 - G(x))^{N-m-i+j}.
\tag{2.2}
\]


Proof: Each of the basic events $E_i$ (defined above) can be written
as a union of disjoint events $E_{ij}$, $j = 0, 1, 2, \ldots, i$, where
$E_{ij}$ consists of $j$ X's (with distribution $F(x)$) $\le x$ and
$i - j$ X's (with distribution $G(x)$) $\le x$, the remaining X's being
$> x$. There are $\binom{m}{j} \binom{N-m}{i-j}$ ways to select such an
event, each having probability
$F^{j}(x) (1 - F(x))^{m-j} G^{i-j}(x) (1 - G(x))^{N-m-i+j}$. We use the
convention that $\binom{a}{b} = 0$ if $a < b$.
Remark: When $G = F$ we can use the fact that
$\sum_{j=0}^{i} \binom{m}{j} \binom{N-m}{i-j} = \binom{N}{i}$ to get the
known result

\[
F_{Nk}(x) = \sum_{i=k}^{N} \binom{N}{i} F^{i}(x) (1 - F(x))^{N-i}.
\tag{2.3}
\]
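Formulas (2.2) and (2.3) are easy to evaluate numerically at a fixed
point $x$. The following sketch is an illustration only; the function
names are hypothetical, and the arguments `Fx` and `Gx` stand for the
values $F(x)$ and $G(x)$.

```python
from math import comb

def order_stat_cdf_two_groups(k, m, N, Fx, Gx):
    """Evaluate (2.2): P(k-th smallest of N <= x), m from F and N - m from G."""
    total = 0.0
    for i in range(k, N + 1):
        for j in range(i + 1):
            # Convention: the binomial coefficient is 0 when a < b.
            if j > m or i - j > N - m:
                continue
            total += (comb(m, j) * comb(N - m, i - j)
                      * Fx ** j * (1 - Fx) ** (m - j)
                      * Gx ** (i - j) * (1 - Gx) ** (N - m - i + j))
    return total

def order_stat_cdf_iid(k, N, Fx):
    """Evaluate (2.3): the identically distributed case."""
    return sum(comb(N, i) * Fx ** i * (1 - Fx) ** (N - i)
               for i in range(k, N + 1))
```

Setting `Gx` equal to `Fx` in the first function reproduces the second,
which is the content of the Remark.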


In order to derive the distribution theory associated with the
sequential ranking procedures proposed in this paper the next lemma will
be useful. We consider a random variable $X$ with a continuous
distribution function $F(x)$ and define the sign of $X$ to be $1$ if
$X > 0$ and $-1$ if $X \le 0$. Letting $E$ = sign of $X$, we can compute
the joint distribution function for $E$ and $|X|$ as

\[
F(x, y) =
\begin{cases}
0, & -\infty < y < 0,\; -\infty < x < \infty \\
0, & -\infty < y < \infty,\; -\infty < x < -1 \\
F(0) - F(-y), & 0 \le y < \infty,\; -1 \le x < 1 \\
F(y) - F(-y), & 0 \le y < \infty,\; 1 \le x < \infty
\end{cases}
\tag{2.4}
\]

where $F(x, y) = P(E \le x, |X| \le y)$, since for $-\infty < y < 0$,
$-\infty < x < \infty$, $|X| \ge 0$ with probability 1 implies
$F(x, y) = 0$; for $-\infty < y < \infty$, $-\infty < x < -1$, $E = \pm 1$
with probability 1 implies $F(x, y) = 0$; for $0 \le y < \infty$,
$-1 \le x < 1$, $F(x, y) = P(-y \le X \le 0) = F(0) - F(-y)$; and for
$0 \le y < \infty$, $1 \le x < \infty$,
$F(x, y) = P(-y \le X \le y) = F(y) - F(-y)$.

In developing the properties of the signed sequential rank an
important role will be played by the dependency of the sign of $X$ and
$|X|$, and thus we establish a condition whereby $E$ and $|X|$ are
independent random variables in

Lemma 2.2. $|X|$ and the sign of $X$ ($= E$) are independent if and only
if $F(-x) = F(0)[1 - F(x) + F(-x)]$ for all $x \ge 0$.

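Proof: The marginal distributions for $E$ and $|X|$ are

\[
P(E \le x) =
\begin{cases}
0, & x < -1 \\
F(0), & -1 \le x < 1 \\
1, & 1 \le x
\end{cases}
\qquad \text{and} \qquad
P(|X| \le y) =
\begin{cases}
0, & y < 0 \\
F(y) - F(-y), & 0 \le y
\end{cases}
\]

and the product of the marginals is

\[
P(E \le x)\, P(|X| \le y) =
\begin{cases}
0, & -\infty < y < 0,\; -\infty < x < \infty \\
0, & -\infty < y < \infty,\; -\infty < x < -1 \\
F(0)[F(y) - F(-y)], & 0 \le y < \infty,\; -1 \le x < 1 \\
F(y) - F(-y), & 0 \le y < \infty,\; 1 \le x < \infty
\end{cases}
\]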


Thus the joint distribution function of $E$ and $|X|$ will factor
into the product of the marginal distributions if and only if
$F(0) - F(-y) = F(0)[F(y) - F(-y)]$ for all $y \ge 0$, which is
equivalent to the condition in the lemma.
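The condition of Lemma 2.2 can be checked numerically for a given
distribution. The sketch below is an illustration under assumptions:
it uses a unit-variance normal distribution as the example, and the
function names are hypothetical. For a distribution symmetric about
zero, $F(0) = 1/2$ and $F(-x) = 1 - F(x)$, so the condition holds; a
shifted normal is used as a contrasting case.

```python
import math

def normal_cdf(x, mu=0.0):
    """Cumulative distribution function of a normal with mean mu, variance 1."""
    return 0.5 * (1.0 + math.erf((x - mu) / math.sqrt(2.0)))

def independence_gap(F, x):
    """F(-x) - F(0) [1 - F(x) + F(-x)]; zero for all x >= 0 under Lemma 2.2."""
    return F(-x) - F(0.0) * (1.0 - F(x) + F(-x))
```

Calling `independence_gap(normal_cdf, x)` gives values that are zero up
to rounding, while `independence_gap(lambda t: normal_cdf(t, mu=1.0), 1.0)`
is clearly nonzero, so the sign and absolute value are dependent after
the shift.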

Remark:   Throughout,  we  will  assume  that  the  basic  random  variables, 
usually  denoted  by  X  or  Y,   are  defined  on  the  same  probability  space 
and  have  continuous  cumulative  distribution  functions.   Thus  the  ranking 
procedures  to  be  defined  will  always  be  determined  uniquely  except 
possibly  for  sets  of  measure  zero. 
3. The Sequential Rank. In the introduction we mentioned the
possibility of ranking observations as they are taken without reranking
the previous observations. We make this idea formal by

Definition 3.1. The sequential rank of $X_n$ relative to
$X_1, X_2, \ldots, X_n$ is $k$ if $X_{nk} = X_n$, $k = 1, 2, \ldots, n$,
where $X_{nk}$ is the $k$th smallest in the set
$\{X_1, X_2, \ldots, X_n\}$.

Thus the sequential rank of $X_1$ is always 1, the sequential rank of
$X_2$ is 1 or 2 according as $X_2 < X_1$ or $X_1 < X_2$, the sequential
rank of $X_3$ is 1, 2 or 3 according as $X_3$ is the smallest, next
largest or largest of the set $\{X_1, X_2, X_3\}$, etc. We use the
notation $Z_i$ for the sequential rank of $X_i$.

Lemma 3.1. There is a one-to-one correspondence between the set of
$n!$ possible orderings $X_{i_1} < X_{i_2} < \cdots < X_{i_n}$ and the
$n!$ possible sequential rank vectors $(Z_1, Z_2, \ldots, Z_n)$.

Proof. We can consider
$(X_1, X_2, \ldots, X_n) = (x_1, x_2, \ldots, x_n)$ where the $x_i$ are
$n$ distinct real numbers, and the set
$\{(x_{i_1}, x_{i_2}, \ldots, x_{i_n})\}$ consisting of the $n!$ vectors
obtained by permuting the coordinates of $(x_1, x_2, \ldots, x_n)$. The
corresponding set $\{(X_{i_1}, X_{i_2}, \ldots, X_{i_n})\}$ gives the $n!$
possible orderings. Now define the mapping $\varphi$ from the set
$\{(x_{i_1}, x_{i_2}, \ldots, x_{i_n})\}$ into the set
$\{(r_1, r_2, \ldots, r_n) : r_1 = 1,\; r_2 = 1, 2,\; \ldots,\;
r_n = 1, 2, \ldots, n\}$ by setting the $j$th coordinate of
$\varphi(x_{i_1}, x_{i_2}, \ldots, x_{i_n})$ equal to the rank of
$x_{i_j}$ in the set $x_{i_1}, x_{i_2}, \ldots, x_{i_j}$, i.e. the $j$th
coordinate is $r$ if $x_{i_j}$ is the $r$th smallest among
$x_{i_1}, x_{i_2}, \ldots, x_{i_j}$. The mapping $\varphi$ is one-to-one
and onto. (This is almost identical to part of the proof of Theorem 1.1
in [2], page 993.)



By this lemma we mean that if we consider each ordering, say
$X_{i_1} < X_{i_2} < \cdots < X_{i_n}$, of a set of observations
$\{X_1, X_2, \ldots, X_n\}$ and use Definition 3.1 to obtain the
associated sequential rank vector $(Z_1, Z_2, \ldots, Z_n)$, the
sequential rank vector is uniquely determined and moreover the
sequential rank vector uniquely determines the ordering. Since a
particular ordering of $X_1, X_2, \ldots, X_n$ also determines the
ordinary rank vector $(T_1, T_2, \ldots, T_n)$ in a one-to-one manner,
there exists a one-to-one mapping between the set of sequential rank
vectors and the set of ordinary rank vectors.

In order to obtain the probability distribution for sequential rank
vectors notice that since a particular ordering
$X_{i_1} < X_{i_2} < \cdots < X_{i_n}$ determines in a one-to-one manner
an ordinary rank vector and a sequential rank vector, it is enough to
determine a mapping from the ordinary rank vector determined by the
ordering to the sequential rank vector determined by the same ordering.
The distribution of $(Z_1, Z_2, \ldots, Z_n)$ is then available for a
wide class of distributions of the basic variables
$X_1, X_2, \ldots, X_n$ since Hoeffding has given the distribution of
$(T_1, T_2, \ldots, T_n)$ in [3].

Consider the indicator function

\[
\chi(x, y) =
\begin{cases}
1 & \text{if } x \le y \\
0 & \text{if } x > y
\end{cases}
\]

and for $X_1, X_2, \ldots, X_n$ define the mapping

\[
\varphi(X_1, X_2, \ldots, X_n) =
\Bigl(1,\; \sum_{j=1}^{2} \chi(X_j, X_2),\; \ldots,\;
\sum_{j=1}^{i} \chi(X_j, X_i),\; \ldots,\;
\sum_{j=1}^{n} \chi(X_j, X_n)\Bigr).
\]

The $i$th coordinate $\sum_{j=1}^{i} \chi(X_j, X_i)$ is equal to the
number of X's in $\{X_1, X_2, \ldots, X_i\}$ which are less than or equal
to $X_i$, that is, the sequential rank of $X_i$. But since $X_i < X_j$ if
and only if $T_i < T_j$ ($i \ne j$) we have

\[
\chi(X_i, X_j) = \chi(T_i, T_j),
\]

and this holds for all $i$ and $j$. Hence we have

\[
\varphi(X_1, X_2, \ldots, X_n) = \varphi(T_1, T_2, \ldots, T_n)
= (Z_1, Z_2, \ldots, Z_n),
\tag{3.1}
\]

and $\varphi$ is a mapping from the ordinary rank vectors to the
sequential rank vectors corresponding to a particular ordering of the
basic variables.
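The mapping in (3.1) translates directly into code. The sketch below is
illustrative; the names `chi`, `phi` and `ordinary_ranks` are chosen to
mirror the notation and are otherwise hypothetical, and distinct
observations are assumed.

```python
def chi(x, y):
    """Indicator: 1 if x <= y, 0 if x > y."""
    return 1 if x <= y else 0

def phi(values):
    """i-th coordinate: sum over j <= i of chi(values[j], values[i])."""
    return [sum(chi(values[j], values[i]) for j in range(i + 1))
            for i in range(len(values))]

def ordinary_ranks(xs):
    """Rank of each observation within the full sample (1-based)."""
    return [sum(chi(y, x) for y in xs) for x in xs]
```

Applying `phi` to the observations and applying it to their ordinary
ranks give the same sequential rank vector, as (3.1) states; for
`[2.5, 0.3, 1.7, 4.0]` both give `[1, 1, 2, 4]`.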

Let $f_i$, $i = 1, 2, \ldots, n$, be continuous, non-decreasing
functions defined on the unit interval such that
$f_i(0) = 1 - f_i(1) = 0$ for each $i$. Denote by
$\mathcal{F}(f_1, f_2, \ldots, f_n)$ the family of all
$(F_1, F_2, \ldots, F_n)$ such that $F_i = f_i(F)$ where $F$ runs through
all continuous distributions. Now if $X_1, X_2, \ldots, X_n$ are
independent and distributed according to $F_1, F_2, \ldots, F_n$, Lehmann
has shown in [1] that

(a) the distribution of the ordinary ranks $T_1, T_2, \ldots, T_n$
obtained from $X_1, X_2, \ldots, X_n$ is constant within each family
$\mathcal{F}(f_1, f_2, \ldots, f_n)$. This is Lemma 3.2.

(b) the power of any rank test depends only on $f_1, f_2, \ldots, f_n$,
and that uniformly most powerful tests exist. This is Theorem 3.1.

Because of the one-to-one correspondence between rank vectors and
sequential rank vectors, properties (a) and (b) are preserved for
sequential ranks. The reason for this is that in computing sequential
rank vectors we are merely identifying each possible ordering
$X_{i_1} < X_{i_2} < \cdots < X_{i_n}$ with a different point in
$n$-dimensional space than when ordinary rank vectors are computed. Thus
the probability associated with any subset of ordinary rank vectors can
also be associated with a unique subset of sequential rank vectors and
we have, analogously as in [1],

Theorem 3.1. Given $n$ functions $f_1, f_2, \ldots, f_n$ and any
sequential rank test of the hypothesis
$H: (F_1, F_2, \ldots, F_n) \in \mathcal{F}(f_1, f_2, \ldots, f_n)$ (i.e.
a test based on the sequential ranks), the power of this test depends
only on $f_1, f_2, \ldots, f_n$. That is, if $(F_1, F_2, \ldots, F_n)$
and $(F_1', F_2', \ldots, F_n')$ belong to the same class
$\mathcal{F}(f_1, f_2, \ldots, f_n)$ the test has the same power against
these two alternatives. Furthermore, given any class of alternatives
$K: (F_1, F_2, \ldots, F_n) \in \mathcal{F}(f_1', f_2', \ldots, f_n')$
there exists a uniformly most powerful test based on the sequential
ranks for testing $H$ against $K$.

When $X_1, X_2, \ldots, X_n$ are independent and identically
distributed the sequential ranks are independent with distribution

\[
P(Z_i = k) = 1/i, \qquad k = 1, 2, \ldots, i, \quad i = 1, 2, \ldots, n.
\]

A proof of this is given in [2]. We see that the mapping defined in
(3.1) takes the vector of dependent ranks $(T_1, T_2, \ldots, T_n)$ into
the vector of independent sequential ranks $(Z_1, Z_2, \ldots, Z_n)$.
Thus according to Theorem 3.1 and the discussion leading to it we lose
nothing in the matter of hypothesis testing by considering sequential
ranks instead of ordinary ranks, and in fact when we are dealing with
independent and identically distributed random variables we find that
the sequential ranks are independent.
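For small $n$ the independence can be confirmed by exhaustive
enumeration. The sketch below is an illustration and not the proof in
[2]: it relies on the fact that for independent, identically
distributed continuous observations all $n!$ orderings are equally
likely, and the function names are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations, product

def sequential_ranks(xs):
    """Sequential rank vector of a sequence with distinct entries."""
    return tuple(sum(1 for y in xs[: i + 1] if y <= xs[i])
                 for i in range(len(xs)))

def joint_distribution(n):
    """Exact law of (Z_1, ..., Z_n) when every ordering has probability 1/n!."""
    counts = Counter(sequential_ranks(p) for p in permutations(range(n)))
    total = sum(counts.values())
    return {z: Fraction(c, total) for z, c in counts.items()}

def is_product_of_uniforms(n):
    """True if every vector in {1} x {1,2} x ... x {1..n} has mass 1/n!."""
    joint = joint_distribution(n)
    target = Fraction(1)
    for i in range(1, n + 1):
        target /= i
    support = set(product(*(range(1, i + 1) for i in range(1, n + 1))))
    return set(joint) == support and all(p == target for p in joint.values())
```

Since $1/n! = \prod_{i=1}^{n} (1/i)$, a uniform joint law on the product
set is the same as independence with $P(Z_i = k) = 1/i$.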

Since there is a one-to-one correspondence between the ordered
observations and the sequential rank vector, the distribution theory for
sequential rank vectors is also completely specified by

\[
\begin{aligned}
P(X_{i_1} \le X_{i_2} \le \cdots \le X_{i_n})
&= \int \cdots \int_{-\infty < x_{i_1} < x_{i_2} < \cdots < x_{i_n} < \infty}
   \prod_{j=1}^{n} dF_{i_j}(x_{i_j}) \\
&= \int \cdots \int_{-\infty < x_{i_1} < x_{i_2} < \cdots < x_{i_n} < \infty}
   \prod_{j=1}^{n} df_{i_j}(F(x_{i_j})) \\
&= \int \cdots \int_{0 < y_{i_1} < y_{i_2} < \cdots < y_{i_n} < 1}
   \prod_{j=1}^{n} df_{i_j}(y_{i_j})
\end{aligned}
\tag{3.2}
\]

where $y_{i_j} = F(x_{i_j})$ and the $X_i$ are assumed to be independent
in this calculation. Let $f = (f_1, f_2, \ldots, f_n)$ and write
$P(X_1 \le X_2 \le \cdots \le X_n) = P(f)$. The distribution function for
the $n!$ vectors $(Z_1, Z_2, \ldots, Z_n)$ is obtained by computing $P(f)$
for all possible permutations of the components of $f$. In order to
determine the marginal distribution for $Z_i$ we notice that $Z_i = k$ if
and only if $X_i$ is the $k$th smallest among the first $i$ observations,
and we get

\[
P(Z_i = k) = \sum P(f), \qquad f = (f_{j_1}, f_{j_2}, \ldots, f_{j_i})
\tag{3.3}
\]

where $f_i$ is the $k$th coordinate of $f$ and the summation is taken
over the $(i-1)!$ permutations of the coordinates leaving $f_i$ fixed at
the $k$th coordinate.

For the special case where the $X_i$ are taken to be identically
distributed, we can take $f_i(x) = x$ without loss of generality, and it
is easy to compute (3.2) and (3.3) to get

(3.4)  $$P(f) = 1/n! \quad \text{and} \quad P(Z_i = k) = 1/i, \qquad k = 1, 2, \ldots, i, \quad i = 1, 2, \ldots, n,$$

yielding the independence of $Z_1, Z_2, \ldots, Z_n$ as noted above.
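
The statement (3.4) can be checked by brute force for small $n$. The following is a minimal sketch, not part of the original report: the function names are illustrative, and it assumes only that under identical continuous distributions all $n!$ orderings of the observations are equally likely.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def sequential_ranks(xs):
    """Z_i = rank of xs[i] among xs[0..i], with 1 for the smallest."""
    return tuple(1 + sum(x < xs[i] for x in xs[:i]) for i in range(len(xs)))


def sequential_rank_distribution(n):
    """Distribution of (Z_1, ..., Z_n) when all n! orderings are equally likely."""
    dist = {}
    for perm in permutations(range(n)):
        z = sequential_ranks(perm)
        dist[z] = dist.get(z, 0) + Fraction(1, factorial(n))
    return dist


def marginal(dist, i, k):
    """P(Z_i = k), with i counted from 1 as in the text."""
    return sum(p for z, p in dist.items() if z[i - 1] == k)


dist = sequential_rank_distribution(4)
```

Each of the 24 rank vectors receives probability 1/24 = 1 · 1/2 · 1/3 · 1/4, which is the product of the marginals, so the joint distribution factors as claimed.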

Another special case, to be used later, is when the $f_i$ are taken
to be the Lehmann alternatives, introduced in [1]. We let $F_i(x) = F^{a_i}(x)$,
$a_i > 0$, and in this case a straightforward computation gives

(3.5)  $$P(X_1 < X_2 < \cdots < X_n) = \frac{\prod_{i=1}^{n} a_i}{\prod_{i=1}^{n} \left( \sum_{j=1}^{i} a_j \right)}.$$
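
Formula (3.5) is easy to evaluate exactly. The sketch below is illustrative only (the function names are assumptions, not from the report); it evaluates (3.5) for a given sequence of exponents and, by applying it to every relabeling, confirms that the probabilities of all orderings sum to one.

```python
from fractions import Fraction
from itertools import permutations


def lehmann_order_prob(a):
    """P(X_1 < ... < X_n) from (3.5), where X_i has distribution F^{a_i}."""
    prob, partial = Fraction(1), Fraction(0)
    for a_i in a:
        partial += a_i                    # a_1 + ... + a_i
        prob *= Fraction(a_i) / partial   # contributes a_i / (a_1 + ... + a_i)
    return prob


def all_order_probs(a):
    """Probability of every ordering, obtained by relabeling the X's."""
    return {perm: lehmann_order_prob([a[i] for i in perm])
            for perm in permutations(range(len(a)))}


probs = all_order_probs([1, 2, 5])
```

For two observations the formula reduces to $P(X_1 < X_2) = a_2/(a_1 + a_2)$, and for equal exponents every ordering has probability $1/n!$.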


By relabeling the X's, the probability of any order of the X's can
be found using (3.5), giving all the values needed in (3.2) to specify
the distribution of the sequential rank vectors.

4.   An Application of Sequential Ranking to Hypothesis Testing.
In the nonparametric, fixed sample size, two sample problem, it is
assumed that there are available two sets of observations
$(X_1, X_2, \ldots, X_m)$ and $(Y_1, Y_2, \ldots, Y_n)$, each set from some
probability distribution. The problem is to test the hypothesis that the
distributions are the same, against the alternative that they are different.
Usually the alternative is more restrictive, as when only a shift in
location is considered. In this section we consider the nonparametric
two sample problem as a sequential problem rather than a fixed sample
size problem.

Let $X_i$, $i = 1, 2, \ldots$, and $Y_j$, $j = 1, 2, \ldots$, be independent
random variables and assume we wish to test

$$H:\ G = F \quad \text{against} \quad K:\ G = f(F)$$

where  F  is  the  continuous  cumulative  distribution  of  the  X's  and 
G  the  continuous  cumulative  distribution  of  the  Y's.   We  propose  to 
use  the  sequential  probability  ratio  statistic  based  on  the  sequential 
ranks  and  we  can  assume  the  observations  to  be  taken  alternately  as 

$$X_1, Y_1, X_2, Y_2, \ldots, X_n, Y_n, \ldots.$$

Let $Z^N = (Z_1, Z_2, \ldots, Z_N)$ be the sequential rank vector based on the
first $N$ observations and write $P_1(Z^N)/P_0(Z^N)$ as the sequential
probability ratio, $P_1$ referring to the alternative to the hypothesis,
$P_0$ to the hypothesis.

Under the hypothesis $P(Z^N = z) = 1/N!$ and $P_0(Z^N) = 1/N!$. Under
the alternative we can compute $P(Z^N = z)$ by noting that each outcome
vector $z$ corresponds, in a one-to-one manner, to a particular order
of the X's and Y's. For example

$$Z^3 = (1, 1, 1) \Leftrightarrow X_2 < Y_1 < X_1, \qquad Z^3 = (1, 2, 1) \Leftrightarrow X_2 < X_1 < Y_1.$$

Each $Z^N$ in turn corresponds to a vector ($(F, G, F)$ or $(F, F, G)$
as in our example) of F's and G's, meaning that the observation
appearing in the $i$th smallest position in the ordering of X's and
Y's has the distribution $F$ or $G$ according as $F$ or $G$ appears
as the $i$th coordinate of the $F$, $G$ vector. Thus to compute
$P(Z^N = z)$ for all possible values of $z$ we need only compute

$$P(U_1 < U_2 < \cdots < U_N)$$


where $U_i$ is an X or a Y according to the outcome. In particular
when $f$ is a continuous increasing function on the unit interval with
$f(0) = 1 - f(1) = 0$, the probability distribution is constant for all
continuous distributions $F$ and depends only on $f$. In fact we have

$$P(U_1 < U_2 < \cdots < U_N) = \int \cdots \int_{-\infty < t_1 < \cdots < t_N < \infty} \prod_{i=1}^{N} df_i(F(t_i)) = \int \cdots \int_{0 < y_1 < \cdots < y_N < 1} \prod_{i=1}^{N} df_i(y_i)$$

by letting $y_i = F(t_i)$, where $f_i(F(t_i)) = F(t_i)$ when $U_i$ is an X and
$f_i(F(t_i)) = f(F(t_i))$ when $U_i$ is a Y.

In the special case of Lehmann alternatives $f(x) = x^a$, $a > 0$,
and by (3.5) we get, for $N$ even,

$$P_1(Z^N) = \frac{a^{N/2}}{\prod_{i=1}^{N} \left( \sum_{j=1}^{i} A_j \right)}, \qquad \text{where } A_i = \begin{cases} 1 & \text{if } U_i \text{ is an X} \\ a & \text{if } U_i \text{ is a Y} \end{cases}$$

and the probability ratio reduces to

$$\frac{P_1(Z^N)}{P_0(Z^N)} = \frac{N!\, a^{N/2}}{\prod_{i=1}^{N} \left( \sum_{j=1}^{i} A_j \right)}.$$


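A similar result holds for $N$ odd. The vector
$A^N = (A_1, A_2, \ldots, A_N)$ gives the same value of $P_1(Z^N = z)$
for $[(N/2)!]^2$ outcomes $z$ out of the $N!$ possible. We can compute
the probability ratios at each stage using the following relations:

(4.1)  $$S_N = \frac{P_1(Z^N)}{P_0(Z^N)} = \begin{cases} \dfrac{N!\, a^{(N-1)/2}}{\prod_{i=1}^{N} \left( \sum_{j=1}^{i} A_j \right)} & N \text{ odd} \\[2ex] \dfrac{N!\, a^{N/2}}{\prod_{i=1}^{N} \left( \sum_{j=1}^{i} A_j \right)} & N \text{ even} \end{cases}$$

(4.2)  $$S_{N+1} = \frac{P_1(Z^{N+1})}{P_0(Z^{N+1})} = \begin{cases} \dfrac{(N+1)!\, a^{(N+1)/2}}{\prod_{i=1}^{Z-1} \left( \sum_{j=1}^{i} A_j \right) \prod_{i=Z-1}^{N} \left( a + \sum_{j=1}^{i} A_j \right)} & N \text{ odd} \\[2ex] \dfrac{(N+1)!\, a^{N/2}}{\prod_{i=1}^{Z-1} \left( \sum_{j=1}^{i} A_j \right) \prod_{i=Z-1}^{N} \left( 1 + \sum_{j=1}^{i} A_j \right)} & N \text{ even} \end{cases}$$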


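where $Z$ = sequential rank of $Y_{(N+1)/2}$, $N$ odd, and $Z$ = sequential rank
of $X_{(N+2)/2}$, $N$ even. At the $(N+1)$st observation $Z$ is determined
and if $Z = k$, the $(N+1)$st observation came between the $(k-1)$st
and the $k$th smallest observations of the preceding $N$ observations.
Thus

$$A^{N+1} = (A_1, A_2, \ldots, A_{k-1}, A^*, A_k, \ldots, A_N)$$

where $A^* = 1$ if the $(N+1)$st observation is an X and $A^* = a$ if the
observation is a Y. Using (4.1) and (4.2) and $Z$ we can pass from $S_N$ to
$S_{N+1}$ as the observations are taken. For example $S_1 = 1$,

$$S_2 = \begin{cases} 2a/(1+a) & \text{if } X_1 < Y_1 \ \leftrightarrow\ A^2 = (1, a) \\ 2/(1+a) & \text{if } Y_1 < X_1 \ \leftrightarrow\ A^2 = (a, 1) \end{cases}$$

$$S_3 = \begin{cases} \dfrac{6a}{(1+a)(2+a)} & \text{if } X_1 < Y_1 < X_2 \text{ or } X_2 < Y_1 < X_1 \ \leftrightarrow\ A^3 = (1, a, 1) \\[2ex] \dfrac{6}{(1+a)(2+a)} & \text{if } Y_1 < X_1 < X_2 \text{ or } Y_1 < X_2 < X_1 \ \leftrightarrow\ A^3 = (a, 1, 1) \\[2ex] \dfrac{3a}{2+a} & \text{if } X_1 < X_2 < Y_1 \text{ or } X_2 < X_1 < Y_1 \ \leftrightarrow\ A^3 = (1, 1, a) \end{cases}$$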
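
The passage from $S_N$ to $S_{N+1}$ can be sketched in a few lines. This is an illustrative sketch, not the author's program: it stores the configuration vector $A^N$ as a list, inserts $A^*$ at the position given by the sequential rank $Z$, and recomputes the ratio from the product form of (4.1) rather than from the incremental form (4.2). The names `ratio_from_config` and `insert_observation` are hypothetical, and $a = 4$ is chosen only for the example.

```python
from fractions import Fraction
from math import factorial


def ratio_from_config(config):
    """S_N = N! * prod(A_i) / prod_i (A_1 + ... + A_i) for config = (A_1, ..., A_N)."""
    ratio, partial = Fraction(factorial(len(config))), Fraction(0)
    for weight in config:
        partial += weight
        ratio *= Fraction(weight) / partial
    return ratio


def insert_observation(config, z, weight):
    """Configuration after a new observation with sequential rank z (1-based)."""
    return config[:z - 1] + [weight] + config[z - 1:]


a = Fraction(4)
config = [Fraction(1)]                      # X_1 alone
s1 = ratio_from_config(config)
config = insert_observation(config, 2, a)   # Y_1 with Z = 2, i.e. X_1 < Y_1
s2 = ratio_from_config(config)
config = insert_observation(config, 1, 1)   # X_2 with Z = 1, i.e. X_2 < X_1 < Y_1
s3 = ratio_from_config(config)
```

With $a = 4$ this gives $S_1 = 1$, $S_2 = 2a/(1+a) = 8/5$ and $S_3 = 3a/(2+a) = 2$, in agreement with the cases listed above.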


We noted before that under the hypothesis $a = 1$ the sequential
ranks $Z_1, Z_2, \ldots, Z_N$ are independent. However, when $a \ne 1$ we do
not have this independence property. Consider the case $N = 3$ where
we observe $X_1$, $Y_1$, $X_2$ in that order. The possible outcomes are


Ordered observations      Sequential ranks      Probability

$X_1 < X_2 < Y_1$         (1, 2, 2)             $a/[2(2+a)]$
$X_2 < X_1 < Y_1$         (1, 2, 1)             $a/[2(2+a)]$
$X_1 < Y_1 < X_2$         (1, 2, 3)             $a/[(1+a)(2+a)]$
$X_2 < Y_1 < X_1$         (1, 1, 1)             $a/[(1+a)(2+a)]$
$Y_1 < X_1 < X_2$         (1, 1, 3)             $1/[(1+a)(2+a)]$
$Y_1 < X_2 < X_1$         (1, 1, 2)             $1/[(1+a)(2+a)]$


and the marginal distributions are easily computed as

$P(Z_1 = 1) = 1$                $P(Z_3 = 1) = a(3+a)/[2(1+a)(2+a)]$
$P(Z_2 = 1) = 1/(1+a)$          $P(Z_3 = 2) = (2+a+a^2)/[2(1+a)(2+a)]$
$P(Z_2 = 2) = a/(1+a)$          $P(Z_3 = 3) = 1/(2+a)$

Now $P((Z_1, Z_2, Z_3) = (1, 1, 1)) = a/[(1+a)(2+a)]$ and

$$P(Z_1 = 1)\, P(Z_2 = 1)\, P(Z_3 = 1) = \frac{a}{(1+a)(2+a)} \cdot \frac{3+a}{2(1+a)},$$

and it follows that $Z_1, Z_2, Z_3$ cannot be independent unless $a = 1$,
since independence of $Z_1, Z_2, Z_3$ implies $(3+a)/[2(1+a)] = 1$, which
in turn implies $a = 1$. Thus we have


Theorem 4.1.   Let $X_1, Y_1, X_2, \ldots, X_n, Y_n$ be independent random
variables with $X_i$ distributed according to $F$ and $Y_i$ distributed
according to $F^a$, $a > 0$. The sequential ranks based on such a sequence
are independent if and only if $a = 1$.
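
The $N = 3$ computation behind the theorem can be reproduced exactly. The sketch below is only an illustration under the Lehmann model of (3.5); the function names are assumptions. It enumerates all orderings of the observations taken in the order $X_1, Y_1, X_2$, attaches the probability given by (3.5) to each, converts each ordering to its sequential rank vector, and tests whether the joint distribution is the product of its marginals.

```python
from fractions import Fraction
from itertools import permutations


def joint_rank_distribution(weights):
    """Joint law of the sequential ranks; weights[i] is the Lehmann
    exponent of the i-th observation taken (1 for an X, a for a Y)."""
    n = len(weights)
    dist = {}
    for order in permutations(range(n)):      # order[0] indexes the smallest value
        prob, partial = Fraction(1), Fraction(0)
        for idx in order:                      # probability of this ordering by (3.5)
            partial += weights[idx]
            prob *= Fraction(weights[idx]) / partial
        pos = {idx: r for r, idx in enumerate(order)}
        z = tuple(1 + sum(pos[j] < pos[i] for j in range(i)) for i in range(n))
        dist[z] = prob
    return dist


def is_independent(dist):
    """True if the joint distribution equals the product of its marginals."""
    n = len(next(iter(dist)))
    marginals = [{} for _ in range(n)]
    for z, p in dist.items():
        for i, zi in enumerate(z):
            marginals[i][zi] = marginals[i].get(zi, 0) + p
    for z, p in dist.items():
        product = Fraction(1)
        for i, zi in enumerate(z):
            product *= marginals[i][zi]
        if product != p:
            return False
    return True


dist_alt = joint_rank_distribution([1, 4, 1])    # X_1, Y_1, X_2 with a = 4
dist_null = joint_rank_distribution([1, 1, 1])   # a = 1
```

For $a = 4$ the vector $(1, 1, 1)$ has probability $a/[(1+a)(2+a)] = 4/30$ and the factorization fails, while for $a = 1$ it holds.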

As an illustration of the sequential probability ratio test based
on the sequential ranks consider the data given below.

X_1 = 3.926     X_10 = 4.08      Y_1 = 4.70      Y_10 = 1.56
X_2 = 3.45      X_11 = 3.67      Y_2 = 4.15      Y_11 = 4.29
X_3 = 2.00      X_12 = 2.94      Y_3 = 4.55      Y_12 = 1.74
X_4 = 2.28      X_13 = 5.90      Y_4 = 3.31      Y_13 = 2.17
X_5 = 3.494     X_14 = 2.18      Y_5 = 2.13      Y_14 = 1.97
X_6 = 4.25      X_15 = 5.39      Y_6 = 4.686     Y_15 = 4.689
X_7 = 2.382     X_16 = 2.74      Y_7 = 2.68      Y_16 = 2.87
X_8 = 3.02      X_17 = 3.492     Y_8 = 2.36      Y_17 = 3.17
X_9 = 3.26      X_18 = 2.70      Y_9 = 3.93

The data is taken from Table 600A, page 600, of "Statistics, A New
Approach," W. A. Wallis and H. V. Roberts, The Free Press, Glencoe,
Illinois. If we assume $X$ has some continuous distribution $F$ and $Y$
has $F^a$ as a distribution then

$$P(X < Y) = \int F(y)\, dF^a(y) = \frac{a}{1+a}.$$

Suppose we consider $a = 4$, $P(X < Y) = .8$ as the alternative to the
hypothesis $a = 1$. We take as boundaries for the sequential probability
ratio test

$$A = \frac{1-\beta}{\alpha} = \frac{1 - .05}{.05} = 19, \qquad B = \frac{\beta}{1-\alpha} = \frac{.05}{1 - .05} = .0526$$

and if $S_N < B$ we accept $H$: $a = 1$, if $S_N > A$ we accept $K$: $a = 4$,
and if $B < S_N < A$ we take another observation and compute $S_{N+1}$,
repeating the test. Using the computational formulas (4.1) and (4.2)
we get


S_1 = 1        S_6  = 6.65     S_11 = .454     S_16 = .234
S_2 = 1.6      S_7  = 8.75     S_12 = .725     S_17 = .168
S_3 = 2.0      S_8  = 4.0      S_13 = .764     S_18 = .242
S_4 = 3.2      S_9  = 3.31     S_14 = .586     S_19 = .138
S_5 = 4.15     S_10 = .809     S_15 = .467     S_20 = .0288

and since $S_{20} < .0526$ we accept $H$ at the 20th observation.
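
The sequential test can be sketched as a short program. This is a minimal illustration, not the author's computation: `sprt_path` is a hypothetical name, the ratio is recomputed from the ordered configuration at every step instead of being updated through (4.2), and only the first four X's and Y's of the table are used, so the run stops before any boundary is reached.

```python
from bisect import insort
from math import factorial


def sprt_path(xs, ys, a, alpha, beta):
    """Rank-based probability ratios S_N for observations taken alternately
    as X_1, Y_1, X_2, Y_2, ..., stopping when a Wald boundary is crossed."""
    upper = (1 - beta) / alpha        # boundary A
    lower = beta / (1 - alpha)        # boundary B
    stream = []
    for x, y in zip(xs, ys):
        stream += [(x, 1.0), (y, float(a))]   # weight 1 for an X, a for a Y
    seen, path = [], []
    for n, obs in enumerate(stream, start=1):
        insort(seen, obs)                     # keep observations in increasing order
        s, partial = float(factorial(n)), 0.0
        for _, weight in seen:                # N! * prod(A_i) / prod(partial sums)
            partial += weight
            s *= weight / partial
        if s < lower:
            decision = "accept H"
        elif s > upper:
            decision = "accept K"
        else:
            decision = "continue"
        path.append((n, s, decision))
        if decision != "continue":
            break
    return path


path = sprt_path([3.926, 3.45, 2.00, 2.28], [4.70, 4.15, 4.55, 3.31],
                 a=4, alpha=0.05, beta=0.05)
```

The resulting ratios agree with the first eight tabulated values, for instance $S_4 = 3.2$, $S_7 = 8.75$ and $S_8 = 4.0$, and every one of them lies strictly between $B = .0526$ and $A = 19$.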

Notice that even though the probability ratio $S_N$ is written as
a function of the sequential ranks, in (4.1) and (4.2), it can also be
computed as a function of the order configuration. By this we mean,
for example, the order configuration 1 a 1 stands for $X_1 < Y_1 < X_2$
or $X_2 < Y_1 < X_1$, and a 1 1 stands for $Y_1 < X_1 < X_2$ or $Y_1 < X_2 < X_1$.
Each order configuration determines a value of $S_N$ as a function of $a$.
It can happen that for some value of $a \ne 1$ and two different
configurations, $S_N$ takes on the same value. As an example consider $N = 6$
and the configurations a 1 1 1 a a and 1 a a a 1 1. The denominators
in $S_6$ for these configurations are

$$g_1(a) = a(a+1)(a+2)(a+3)(2a+3)(3a+3)$$

$$g_2(a) = 1(1+a)(1+2a)(1+3a)(2+3a)(3+3a)$$

respectively. For $a = 1/2$ and $a = 2$ we get

$$g_1(1/2) = g_2(1/2) = 945/8 \quad \text{and} \quad g_1(2) = g_2(2) = 7560.$$
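
The coincidence of the two denominators is easy to confirm with exact arithmetic. The following sketch is illustrative (the helper name `denominator` is an assumption); it forms the product of partial sums for a configuration written as a string of the symbols 1 and a.

```python
from fractions import Fraction
from itertools import accumulate
from math import prod


def denominator(config, a):
    """Product of the partial sums A_1, A_1 + A_2, ... for a configuration
    given as a string such as 'a 1 1 1 a a'."""
    weights = [Fraction(a) if symbol == "a" else Fraction(1)
               for symbol in config.split()]
    return prod(accumulate(weights))


g1_half = denominator("a 1 1 1 a a", Fraction(1, 2))
g2_half = denominator("1 a a a 1 1", Fraction(1, 2))
g1_two = denominator("a 1 1 1 a a", 2)
g2_two = denominator("1 a a a 1 1", 2)
```

The two configurations agree at $a = 1/2$ and $a = 2$ but not in general; at $a = 3$, for example, the denominators are 38880 and 36960.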

Let $c(t)$ be the number of different configurations such that
$S_N = t$. We have

(4.3)  $$P(S_N = t \mid a) = \frac{[N/2]!\, (N - [N/2])!\; c(t)\, a^{[N/2]}}{\prod_{i=1}^{N} \left( \sum_{j=1}^{i} a_j \right)}$$

where the $a_j$'s correspond to any particular configuration making
$S_N = t$ ($a_j = 1$ or $a$ according as X or Y is in the $j$th place).
(4.3) follows because any two configurations which make $S_N = t$ have
the same probability under the alternative to the hypothesis. Under the
hypothesis

(4.4)  $$P(S_N = t \mid a = 1) = \frac{[N/2]!\, (N - [N/2])!\; c(t)}{N!}.$$

In (4.3) and (4.4) $[x]$ is the greatest integer function.

In Wald's sequential probability ratio test the approximations
$A \le (1-\beta)/\alpha$ and $B \ge \beta/(1-\alpha)$ are valid when the
probability of termination of the test is 1. These inequalities were
derived under the assumption that the basic sequence of probability
ratios was determined from an independent sequence of observations and
that the sequential probability ratio at the $n$th observation is formed
as a product of independent and identically distributed random variables.
Under the alternative hypothesis we have found that the sequential ranks
are not independent. Thus we must now show that the test terminates with
probability 1 in order to interpret $\alpha$ and $\beta$ as error
probabilities.

It is enough to show that the test terminates with probability 1
considering only $N$ even. For $N$ even we can write

(4.5)  $$S_N^{-1} = \prod_{i=1}^{N} a^{-1/2}\, \frac{1}{i} \sum_{j=1}^{i} A_j$$

and define $\bar{A}_i^N = \frac{1}{i} \sum_{j=1}^{i} A_j$ with
$Y_i^N = a^{-1/2} \bar{A}_i^N$ and $Z_i^N = \log Y_i^N$. We consider first
the case where the null hypothesis is true. $A_1, A_2, \ldots, A_N$ are
dependent random variables with

(4.6)  $$P(A_j = 1) = P(A_j = a) = 1/2$$

giving $E(A_j) = E(\bar{A}_i^N) = \frac{1+a}{2}$. For $i \ne j$ we have

$$P(A_i = 1, A_j = 1) = P(A_i = a, A_j = a) = \frac{N-2}{4(N-1)}$$

$$P(A_i = 1, A_j = a) = P(A_i = a, A_j = 1) = \frac{N}{4(N-1)}$$

and a simple computation gives $\mathrm{Var}(A_j) = \left( \frac{1-a}{2} \right)^2$ and
$\mathrm{Cov}(A_i, A_j) = -\frac{1}{N-1} \left( \frac{1-a}{2} \right)^2$,
$E(Y_i^N) = a^{-1/2} E(\bar{A}_i^N) = \frac{a^{-1/2} + a^{1/2}}{2}$, and

$$\begin{aligned}
\mathrm{Var}(Y_i^N) = a^{-1}\, \mathrm{Var}(\bar{A}_i^N)
 &= \frac{1}{a i^2} \left\{ \sum_{j=1}^{i} \mathrm{Var}(A_j) + 2 \sum_{j=1}^{i-1} \sum_{k=j+1}^{i} \mathrm{Cov}(A_j, A_k) \right\} \\
 &= \frac{1}{a i^2} \left\{ i \left( \frac{1-a}{2} \right)^2 - (i^2 - i)\, \frac{1}{N-1} \left( \frac{1-a}{2} \right)^2 \right\} \\
 &= \frac{1}{a} \left( \frac{1-a}{2} \right)^2 \frac{N-i}{(N-1)\, i}
\end{aligned}$$

and notice that $\mathrm{Var}(Y_i^N)$ is decreasing in $i$ as $i = 1, 2, \ldots, N$.
If $1 < a$ then $1 \le \bar{A}_i^N \le a$ and $a^{-1/2} \le Y_i^N \le a^{1/2}$. If $a < 1$ then
$a^{1/2} \le Y_i^N \le a^{-1/2}$.

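In order to show that the test terminates with probability one it
is enough to show that $S_N^{-1} \to \infty$ in probability. Thus for arbitrary
positive $B$ we show that

$$\lim_{N \to \infty} P(S_N^{-1} < 1/B) = \lim_{N \to \infty} P(\log S_N^{-1} < \log 1/B) = 0.$$

Let $K = \log 1/B$ and use Chebyshev's inequality to get

$$\begin{aligned}
P\left( \sum_{i=1}^{N} Z_i^N < K \right)
 &= P\left( \sum_{i=1}^{N} Z_i^N - \sum_{i=1}^{N} E(Z_i^N) < K - \sum_{i=1}^{N} E(Z_i^N) \right) \\
 &\le P\left( \left| \sum_{i=1}^{N} Z_i^N - \sum_{i=1}^{N} E(Z_i^N) \right| > -K + \sum_{i=1}^{N} E(Z_i^N) \right)
\end{aligned}$$

by taking $N$ large enough to make $K - \sum_{i=1}^{N} E(Z_i^N) < 0$. This can be done
since $Y_i^N$ is bounded, and bounded away from 0, so that we have

$$Z_i^N = \log Y_i^N = \log \lambda_a + \frac{Y_i^N - \lambda_a}{\lambda_a} - \frac{(Y_i^N - \lambda_a)^2}{2 (\xi_i^N)^2}$$

where $\lambda_a = E(Y_i^N) = \frac{a^{-1/2} + a^{1/2}}{2} > 1$ and $\xi_i^N$ is bounded away from 0,
and further

$$E(Z_i^N) \ge \log \lambda_a - C\, \frac{N-i}{i(N-1)}, \qquad C > 0,$$

$$\sum_{i=1}^{N} E(Z_i^N) = N \log \lambda_a - O(\log N) = O(N) > 0.$$

Thus

$$P\left( \sum_{i=1}^{N} Z_i^N < K \right) \le \frac{\mathrm{Var}\left( \sum_{i=1}^{N} Z_i^N \right)}{\left( \sum_{i=1}^{N} E(Z_i^N) - K \right)^2}$$

and

$$\begin{aligned}
\mathrm{Var}\left( \sum_{i=1}^{N} Z_i^N \right)
 &= \sum_{i=1}^{N} \mathrm{Var}(Z_i^N) + 2 \sum_{i<j} \mathrm{Cov}(Z_i^N, Z_j^N) \\
 &\le \sum_{i=1}^{N} \mathrm{Var}(Z_i^N) + 2 \sum_{i<j} \left( \mathrm{Var}(Z_i^N) \cdot \mathrm{Var}(Z_j^N) \right)^{1/2}.
\end{aligned}$$

Now, expanding $\log Y_i^N$ in only two terms,

$$\mathrm{Var}(Z_i^N) = E\left( \log Y_i^N - E(\log Y_i^N) \right)^2 \le E\left( \log Y_i^N - \log \lambda_a \right)^2 = E\left( \frac{Y_i^N - \lambda_a}{\xi_i^N} \right)^2 \le c\, \mathrm{Var}(Y_i^N) = c'\, \frac{N-i}{i(N-1)}$$

and $\frac{N-i}{i(N-1)}$ is decreasing in $i$. Now we can write

$$\mathrm{Var}\left( \sum_{i=1}^{N} Z_i^N \right) \le O(\log N) + 2c \sum_{i=1}^{N} (N-i)\, \mathrm{Var}(Z_i^N) = O(N \log N)$$

and finally

$$P\left( \sum_{i=1}^{N} Z_i^N < K \right) \le \frac{O(N \log N)}{O(N^2)} \to 0 \quad \text{as } N \to \infty.$$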


Since $\log S_N^{-1} = \sum_{i=1}^{N} Z_i^N$ we have $S_N^{-1} \to \infty$ in probability, and when the
null hypothesis is true, the test terminates with probability 1.

More generally now consider

(4.8)  $$P_N = P(A^{-1} \le S_N^{-1} \le B^{-1}) = P(K_1 - \mu_N \le \log S_N^{-1} - \mu_N \le K_2 - \mu_N)$$

where $K_1 = \log A^{-1}$, $K_2 = \log B^{-1}$ and $\mu_N = E(\log S_N^{-1})$. For large
enough values of $N$

(4.9)  $$P_N \le \begin{cases} P\left( |\log S_N^{-1} - \mu_N| > \mu_N - K_2 \right) \le \dfrac{\mathrm{Var}(\log S_N^{-1})}{(\mu_N - K_2)^2} & \text{if } \mu_N \to \infty \\[2ex] P\left( |\log S_N^{-1} - \mu_N| > K_1 - \mu_N \right) \le \dfrac{\mathrm{Var}(\log S_N^{-1})}{(K_1 - \mu_N)^2} & \text{if } \mu_N \to -\infty \end{cases}$$

The test will terminate with probability 1 as long as $P_N \to 0$,
and this is independent of the true distribution of the Y population
since the inequalities in (4.9) were obtained without reference to the
distribution of the Y's. In particular we found that when the X and
Y populations are identically distributed, $\mu_N = O(N)$ and
$\mathrm{Var}(\log S_N^{-1}) = O(N \log N)$.

The method just given to show that the probability of termination
of the test is one is not satisfactory for all alternatives since the
verification of condition (4.9) is difficult. We now consider a better
approach. As before take $N$ even and write the probability ratio as

(4.10)  $$T_N = S_N^{-1} = \prod_{i=1}^{N} a^{-1/2}\, \frac{1}{i} \sum_{j=1}^{i} A_j.$$

In order to show that the probability of termination is one it is
enough to show that $N^{-1} \log T_N$ converges to some non-zero constant
since for fixed boundaries $A$, $B$, the equivalent formulation

$$\frac{1}{N} \log A^{-1} < \frac{1}{N} \log T_N < \frac{1}{N} \log B^{-1}$$

will terminate with probability one, provided $N^{-1} \log T_N$ converges in
probability to some non-zero constant.

Let $2n = N$ and let $Z_1, Z_2, \ldots, Z_N$ be the order statistics for
the combined sample. Define the empirical cumulative distribution
functions for the X's and Y's as

$$F_n(t) = \frac{\{\text{number of X's} \le t\}}{n}, \qquad G_n(t) = \frac{\{\text{number of Y's} \le t\}}{n}.$$

Since $\sum_{j=1}^{i} A_j$ = {number of X's in $Z_1, Z_2, \ldots, Z_i$} + $a$ {number of
Y's in $Z_1, Z_2, \ldots, Z_i$} we can write

$$\frac{1}{i} \sum_{j=1}^{i} A_j = \frac{n}{i} F_n(Z_i) + \frac{n}{i}\, a\, G_n(Z_i)$$

and

$$\frac{1}{N} \log T_N = -\frac{1}{2} \log a - \log 2 + \log N - \frac{1}{N} \log N! + \frac{1}{N} \sum_{i=1}^{N} \log \left( F_n(Z_i) + a\, G_n(Z_i) \right).$$

Since $\lim_{N \to \infty} \left( \log N - \frac{1}{N} \log N! \right) = 1$, we have

$$\begin{aligned}
\lim_{N \to \infty} N^{-1} \log T_N
 &= \log \frac{e}{2\sqrt{a}} + \lim_{N \to \infty} \frac{1}{N} \sum_{i=1}^{N} \log \left( F_n(Z_i) + a\, G_n(Z_i) \right) \\
 &= \log \frac{e}{2\sqrt{a}} + \frac{1}{2} \int \log \left( F(x) + a\, G(x) \right) \left( dF(x) + dG(x) \right),
\end{aligned}$$

the latter limit following from a result of I. R. Savage and
J. Sethuraman communicated to the author by Sethuraman as


Theorem (Savage-Sethuraman).   Let $X_1, X_2, \ldots, X_n$, $Y_1, Y_2, \ldots, Y_n$
be independent random variables where the $X_i$ are distributed according
to the continuous distribution $F$ and the $Y_i$ according to the
continuous distribution $G$. Let $Z_1, Z_2, \ldots, Z_N$ ($N = 2n$) be the order
statistics of the combined sample and let $F_n$ and $G_n$ be the empirical
cumulative distribution functions of the X's and Y's respectively.
Then

$$N^{-1} \sum_{i=1}^{N} \log \left( F_n(Z_i) + a\, G_n(Z_i) \right) \to \frac{1}{2} \int_{-\infty}^{\infty} \log \left( F(x) + a\, G(x) \right) \left( dF(x) + dG(x) \right)$$

in probability (see [10]).

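In our case $G = F$ or $F^a$ depending upon which hypothesis holds.
However we will consider the entire class of alternatives $F^b$, $b > 0$,
which could hold. Let $N^{-1} \log T_N \to L_a(b)$. Then

(4.12)
$$\begin{aligned}
L_a(b) &= \log \frac{e}{2\sqrt{a}} + \frac{1}{2} \int_{-\infty}^{\infty} \log(F + aF^b)\, d(F + F^b) \\
 &= \log \frac{e}{2\sqrt{a}} + \frac{1}{2a} \int_{-\infty}^{\infty} \log(F + aF^b)\, d(F + aF^b) + \frac{1}{2} \int_{-\infty}^{\infty} \log(F + aF^b)\, d\left( (1 - 1/a) F \right) \\
 &= \log \frac{e}{2\sqrt{a}} + \frac{1}{2a} \int_0^{1+a} \log t\, dt + \frac{a-1}{2a} \int_0^1 \log(t + a t^b)\, dt \\
 &= -\log 2 - \frac{1}{2} \log a + \frac{1+a}{2a} \log(1+a) + \frac{a-1}{2a} \int_0^1 \log(1 + a t^{b-1})\, dt.
\end{aligned}$$

The function $\int_0^1 \log(1 + a t^{b-1})\, dt$ decreases as $b$ increases, and thus
$L_a(b)$ is monotone in $b$, decreasing when $1 < a$, and increasing when
$a < 1$.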

Under the null hypothesis $b = 1$ and

$$L_a(1) = \log \frac{a^{-1/2} + a^{1/2}}{2} > 0 \quad \text{for } a \ne 1.$$

Under the alternative hypothesis $b = a$ and

$$L_a(a) = \log \frac{a^{-1/2} + a^{1/2}}{2} - \frac{(a-1)^2}{2} \int_0^1 \frac{1}{a + t^{1-a}}\, dt.$$

In order to show that the test terminates with probability 1 we must
have $L_a(a) \ne 0$ for $a \ne 1$. In fact we will show that $L_a(a) < 0$
for $a \ne 1$. Notice first that it is enough to consider $0 < a < 1$
since

$$\begin{aligned}
L_{1/a}(1/a) &= \log \frac{a^{1/2} + a^{-1/2}}{2} - \frac{(a^{-1} - 1)^2}{2} \int_0^1 \frac{1}{a^{-1} + t^{1 - 1/a}}\, dt \\
 &= \log \frac{a^{-1/2} + a^{1/2}}{2} - \frac{(a-1)^2}{2a^2} \int_0^1 \frac{a}{1 + a t^{1 - 1/a}}\, dt \\
 &= \log \frac{a^{-1/2} + a^{1/2}}{2} - \frac{(a-1)^2}{2a^2} \int_0^1 \frac{a^2 s^{a-1}}{1 + a s^{a-1}}\, ds \qquad (t = s^a) \\
 &= \log \frac{a^{-1/2} + a^{1/2}}{2} - \frac{(a-1)^2}{2} \int_0^1 \frac{1}{a + s^{1-a}}\, ds = L_a(a).
\end{aligned}$$


We can write

$$\begin{aligned}
2 L_a(a) &= \log \left( 1 + \frac{(a-1)^2}{4a} \right) - (a-1)^2 \int_0^1 \frac{1}{a + t^{1-a}}\, dt \\
 &= \int_0^1 \frac{(a-1)^2}{4a + (a-1)^2 t}\, dt - \int_0^1 \frac{(a-1)^2}{a + t^{1-a}}\, dt \\
 &= (a-1)^2 \int_0^1 \left( \frac{1}{4a + (a-1)^2 t} - \frac{1}{a + t^{1-a}} \right) dt
\end{aligned}$$

and we wish to show that $a + t^{1-a} < 4a + (a-1)^2 t$ for $0 \le t \le 1$ and
$0 < a < 1$. Define

$$h(a, t) = 3a + (a-1)^2 t - t^{1-a}$$

and notice that

$$\frac{\partial h}{\partial t} = (a-1)^2 - (1-a)\, t^{-a} \le (a-1)^2 - (1-a) = (a-1)\, a < 0$$

$$\frac{\partial^2 h}{\partial t^2} = a (1-a)\, t^{-(a+1)} > 0.$$

Since $h(a, 0) = 3a$, $h(a, 1) = a(a+1)$ we may conclude that
$h(a, t) > 0$, which makes the integrand in $2L_a(a)$ negative as was to
be proved.

We have shown that the sequential test terminates with probability
one under the null and alternative hypotheses, and moreover the test will
terminate with probability one when the Y's are distributed according
to $F^b$ for $b > 0$ except possibly for only one value of $b$. This
follows from the monotonicity of $L_a(b)$.
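
The limits $L_a(1)$ and $L_a(a)$ can be examined numerically. The sketch below is only an illustration: it evaluates the last line of (4.12) and the integral form of $L_a(a)$ with a plain midpoint rule (a choice made here for self-containment, not taken from the report), and the function names are hypothetical.

```python
from math import log


def midpoint_integral(f, n=20000):
    """Midpoint-rule approximation of the integral of f over (0, 1)."""
    h = 1.0 / n
    return h * sum(f((k + 0.5) * h) for k in range(n))


def limit_L(a, b):
    """L_a(b) from the last line of (4.12)."""
    integral = midpoint_integral(lambda t: log(1.0 + a * t ** (b - 1.0)))
    return (-log(2.0) - 0.5 * log(a)
            + (1.0 + a) / (2.0 * a) * log(1.0 + a)
            + (a - 1.0) / (2.0 * a) * integral)


def limit_L_alt(a):
    """L_a(a) written as log((a^{-1/2} + a^{1/2})/2) minus the integral term."""
    integral = midpoint_integral(lambda t: 1.0 / (a + t ** (1.0 - a)))
    return log((a ** -0.5 + a ** 0.5) / 2.0) - (a - 1.0) ** 2 / 2.0 * integral


null_limit = limit_L(4.0, 1.0)   # b = 1
mid_limit = limit_L(4.0, 2.0)    # an intermediate value of b
alt_limit = limit_L(4.0, 4.0)    # b = a
```

For $a = 4$ the limit is positive at $b = 1$, negative at $b = a$, and decreasing in between, so it changes sign exactly once, which is the exceptional value of $b$ mentioned above. The two expressions for $L_a(a)$ agree, and $L_4(4)$ and $L_{1/4}(1/4)$ coincide up to quadrature error.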

We also remark here that for a fixed sample size test of

$$\mathcal{H}_0:\ X \sim F,\ Y \sim F \quad \text{against} \quad \mathcal{H}_1:\ X \sim F,\ Y \sim F^a, \quad a \ne 1,\ a > 0$$

using ranks of observations, the Neyman-Pearson theory would give a
most powerful test of the form

$$\text{accept } \mathcal{H}_0 \text{ for } S_N^{-1} > K.$$

An equivalent test would be to accept $\mathcal{H}_0$ if $\frac{1}{N} \log S_N^{-1} > \frac{1}{N} \log K$.
Assume $a > 1$ and let $L_a(b_0) = 0$. Then

$$\lim_{N \to \infty} P\left( \tfrac{1}{N} \log S_N^{-1} > \tfrac{1}{N} \log K \right) = 1 \quad \text{if } Y \sim F^b,\ b < b_0$$

$$\lim_{N \to \infty} P\left( \tfrac{1}{N} \log S_N^{-1} < \tfrac{1}{N} \log K \right) = 1 \quad \text{if } Y \sim F^b,\ b > b_0$$

and thus for a test of the composite hypotheses

$$\mathcal{H}_0:\ X \sim F,\ Y \sim F^b, \quad 0 < b < b_0(a)$$

against

$$\mathcal{H}_1:\ X \sim F,\ Y \sim F^b, \quad b_0(a) < b$$

the test is consistent.

5.   The Signed Sequential Rank.   We now extend the ranking procedure
defined in Section 3 to include the sign of the observation. This
corresponds to the signed rank statistic used in fixed sample size problems.

Definition 5.1.   The signed sequential rank of $X_n$ relative to
$X_1, X_2, \ldots, X_n$ is the product of the sequential rank of $|X_n|$ relative
to $|X_1|, |X_2|, \ldots, |X_n|$ and sign$(X_n)$, where sign$(X_n) = 1$ if
$X_n \ge 0$ and sign$(X_n) = -1$ if $X_n < 0$.

In the case of sequential rank vectors there are $N!$ points in the
sample space corresponding to a sample $X_1, X_2, \ldots, X_N$ and in the
case of signed sequential rank vectors there are $2^N N!$ points
corresponding to the same sample. Of course if the basic variables (the $X_i$)
are positive random variables (or negative) the signed sequential ranks
are equivalent to the sequential ranks.

We found in Section 3 that when the basic random variables are
independent and identically distributed the sequential ranks are
independent. This result does not hold in general for signed sequential
ranks and so we now determine a sufficient condition for this result to
hold in this case.

Let $X_1, X_2, \ldots, X_N$ be independent and identically distributed
random variables and let $Z_i$ = sequential rank of $|X_i|$ relative to
$|X_1|, |X_2|, \ldots, |X_i|$, $E_i$ = sign$(X_i)$ with $Y_i = E_i Z_i$, $i = 1,
2, \ldots, N$. If $F(x) = P(X \le x)$ satisfies the condition in Lemma 2.2,
$E_i, |X_1|, |X_2|, \ldots, |X_i|$ are independent and it follows that $E_i$
and $Z_i$ are independent. Thus we get

(5.1)
$$\begin{aligned}
P(Y_i = j) &= P(E_i = 1,\ Z_i = j) = P(E_i = 1)\, P(Z_i = j) = (1 - F(0))\, \frac{1}{i} \\
P(Y_i = -j) &= P(E_i = -1,\ Z_i = j) = P(E_i = -1)\, P(Z_i = j) = F(0)\, \frac{1}{i}
\end{aligned}$$

for $j = 1, 2, \ldots, i$, $i = 1, 2, \ldots, N$. $P(Z_i = j) = 1/i$ follows
from (3.4).
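
Definition 5.1 translates directly into code. The following is a minimal sketch with a hypothetical function name; it takes sign(0) = +1 as in the definition and assumes no ties among the absolute values, which holds with probability one for continuous distributions.

```python
def signed_sequential_ranks(xs):
    """Y_i = sign(X_i) * (rank of |X_i| among |X_1|, ..., |X_i|)."""
    ys = []
    for i, x in enumerate(xs):
        # sequential rank of |x| among the absolute values seen so far
        rank = 1 + sum(abs(prev) < abs(x) for prev in xs[:i])
        ys.append(rank if x >= 0 else -rank)
    return ys


ranks = signed_sequential_ranks([0.5, -1.2, 0.3, -0.1, 2.0])
```

For the sample above the signed sequential ranks are 1, -2, 1, -1, 5. When every observation is positive they coincide with the ordinary sequential ranks.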

We will now show that the condition given in Lemma 2.2 is a
sufficient condition to guarantee the independence of the signed
sequential ranks.

Theorem 5.1.   If $X_1, X_2, \ldots, X_N$ are independent and identically
distributed according to $F(x)$ where $F(-x) = F(0)[1 - F(x) + F(-x)]$ for
all $x \ge 0$ then the signed sequential ranks $Y_1, Y_2, \ldots, Y_N$ are
independent random variables.

Proof:   Let $(i_1, i_2, \ldots, i_N)$ be an arbitrary outcome vector for
$(Y_1, Y_2, \ldots, Y_N)$ and let $k$ be the number of positive integers in
$(i_1, i_2, \ldots, i_N)$. We have

$$\prod_{m=1}^{N} P(Y_m = i_m) = \frac{[1 - F(0)]^{k}\, [F(0)]^{N-k}}{N!}$$

from (5.1). Each outcome vector corresponds to a particular ordering of
the X's, with $N-k$ of the X's negative. The absolute values of these
$N-k$ X's have a particular ordering among the positive X's. So each
outcome vector is equivalent to an event like
$[0 < \epsilon_1 X_{j_1} < \epsilon_2 X_{j_2} < \cdots < \epsilon_N X_{j_N}]$
where $k$ of the $\epsilon_i$ are 1 and $N-k$ are $-1$. The distribution
function for $-X$ is $1 - F(-x)$ and using $F(-x) = F(0)[1 - F(x) + F(-x)]$
we have

$$d\{1 - F(-x)\} = -dF(-x) = \frac{F(0)}{1 - F(0)}\, dF(x).$$

Hence

$$\begin{aligned}
P(Y_1 = i_1, Y_2 = i_2, \ldots, Y_N = i_N)
 &= P(0 < \epsilon_1 X_{j_1} < \cdots < \epsilon_N X_{j_N}) \\
 &= \int \cdots \int_{0 < y_1 < \cdots < y_N < \infty} \prod_{i=1}^{N} dF_{\epsilon_i X_{j_i}}(y_i) \\
 &= \left[ \frac{F(0)}{1 - F(0)} \right]^{N-k} \int \cdots \int_{0 < y_1 < \cdots < y_N < \infty} \prod_{i=1}^{N} dF_{X_{j_i}}(y_i) \\
 &= \left[ \frac{F(0)}{1 - F(0)} \right]^{N-k} P(0 < X_{j_1} < \cdots < X_{j_N}) \\
 &= \left[ \frac{F(0)}{1 - F(0)} \right]^{N-k} \frac{P(0 < X_i \text{ for all } i)}{N!} = \frac{[F(0)]^{N-k}\, [1 - F(0)]^{k}}{N!}.
\end{aligned}$$

Thus $P(Y_1 = i_1, Y_2 = i_2, \ldots, Y_N = i_N) = \prod_{j=1}^{N} P(Y_j = i_j)$, establishing
the independence.

Remark:   In the proof of the theorem we have assumed that $F(0) \ne 1$.
If $F(0) = 1$, the $X_i$ are negative random variables and the signed
sequential ranks reduce to $\{-(\text{sequential rank of } |X_i|)\}$, which are
independent.

The condition $F(-x) = F(0)[1 - F(x) + F(-x)]$ for all $x \ge 0$ is
satisfied by distributions of positive, negative and symmetric (about 0)
random variables. A larger class of distributions satisfies the
condition. If we consider all measurable sets $A \subset [0, \infty)$ and define
$-A = \{x:\ -x \in A\}$, then the condition

$$\Pr\{X \in A\} = k\, \Pr\{X \in -A\}, \qquad k > 0, \quad \text{all } A$$

is enough to insure that $F(-x) = F(0)[1 - F(x) + F(-x)]$ for all $x \ge 0$,
since taking $A = [0, \infty)$ we get $k = \frac{1 - F(0)}{F(0)}$ and taking $A = [0, x]$ we get
$F(-x) = F(0)[1 - F(x) + F(-x)]$ for all $x \ge 0$. On the other hand,
starting with $F(-x) = F(0)[1 - F(x) + F(-x)]$ for all $x \ge 0$ we get

$$dF(x) = \frac{1 - F(0)}{F(0)}\, d\{-F(-x)\}$$

and

$$\Pr\{X \in A\} = \int_A dF(x) = \frac{1 - F(0)}{F(0)} \int_A d\{-F(-x)\} = \frac{1 - F(0)}{F(0)}\, \Pr\{X \in -A\}.$$


We now consider the asymptotic distributions of sums of signed
sequential ranks based on observations from a distribution satisfying
the condition in Theorem 5.1. Let $X_1, X_2, \ldots, X_n$ be independent
identically distributed random variables with common distribution
function $F(x)$ such that for all $x \ge 0$, $F(-x) = F(0)[1 - F(x) + F(-x)]$
holds. Define $Y_n$ = signed sequential rank of $X_n$. Using (5.1) we
get easily

$$E(Y_n) = (1 - 2F(0))\, \frac{n+1}{2}$$

$$\mathrm{Var}(Y_n) = \left( \frac{1}{3} - \frac{(1 - 2F(0))^2}{4} \right) n^2 + \frac{1 - (1 - 2F(0))^2}{2}\, n + \frac{2 - 3(1 - 2F(0))^2}{12}.$$

When $F(x)$ satisfies the condition of Theorem 5.1 the signed
sequential ranks are independent, but not identically distributed, and
forming the partial sums, $S_n = \sum_{i=1}^{n} Y_i$, we have

(5.3)
$$\begin{aligned}
E(S_n) &= \frac{1 - 2F(0)}{4}\, n^2 + \frac{3 (1 - 2F(0))}{4}\, n \\
\mathrm{Var}(S_n) &= \frac{4 - 3(1 - 2F(0))^2}{72}\, n(n+1)(2n+1) + \frac{1 - (1 - 2F(0))^2}{4}\, n(n+1) + \frac{4 - 6(1 - 2F(0))^2}{24}\, n.
\end{aligned}$$

Now for $\epsilon > 0$, $k = 1, 2, \ldots, n$, $\sigma_n^2 = \mathrm{Var}(S_n)$, it follows that for
large enough values of $n$

$$\int_{|y - E(Y_k)| > \epsilon \sigma_n} (y - E(Y_k))^2\, dF_{Y_k}(y) = 0$$

because the range of integration becomes a set with zero probability
since $Y_k$ is bounded according to $|Y_k| \le k$ and $\sigma_n \sim c \cdot n^{3/2}$. Thus
as $n \to \infty$ the integral is zero for all $k = 1, 2, \ldots, n$ and by the
Lindeberg-Feller Theorem it follows that $S_n$ is asymptotically normal.

If we normalize the signed sequential ranks and then consider
partial sums we get

$$S_n^* = \sum_{i=1}^{n} \frac{Y_i - E(Y_i)}{[\mathrm{Var}(Y_i)]^{1/2}}$$

and

$$\frac{|Y_i - E(Y_i)|}{[\mathrm{Var}(Y_i)]^{1/2}} \le \frac{2i}{\left( \alpha\, i^2 + \frac{1 - (1 - 2F(0))^2}{2}\, i + \frac{2 - 3(1 - 2F(0))^2}{12} \right)^{1/2}} \to \frac{2}{\alpha^{1/2}}$$

as $i \to \infty$ where $\alpha = \frac{1}{3} - \frac{(1 - 2F(0))^2}{4}$. Hence the normalized signed
sequential ranks are uniformly bounded and by the bounded Lyapounov
Theorem $S_n^*/\sqrt{n}$ is asymptotically distributed as a unit normal random
variable.

As was noted in the introduction, some statistical problems are concerned with detecting a change in the distribution of a sequence of observations obtained from some process. We now consider the case where in the basic set of independent random variables $\{X_1, X_2, \ldots, X_N\}$ the first m are distributed according to F(x) and the remaining N - m are distributed according to G(x). As before let $Y_i$ denote the signed sequential rank of $X_i$. Since each possible outcome vector for $(Y_1, Y_2, \ldots, Y_N)$ corresponds to an event of the form

$$[\,0 < \epsilon_1 X_{i_1} < \epsilon_2 X_{i_2} < \cdots < \epsilon_N X_{i_N}\,]$$

where $\epsilon_i = \pm 1$ and $(i_1, i_2, \ldots, i_N)$ is a permutation of (1, 2, ..., N), the joint distribution of the signed sequential ranks is obtainable, in principle, from

$$P(0 < \epsilon_1 X_{i_1} < \epsilon_2 X_{i_2} < \cdots < \epsilon_N X_{i_N}) = P(F, G, \epsilon) \tag{5.4}$$

where $\epsilon = (\epsilon_1, \epsilon_2, \ldots, \epsilon_N)$. In general (when $F \ne G$) the $Y_i$ are not independent. For example if we are sampling from an unknown distribution F(x) and we wish to detect a change in distribution to $F^a(x)$, a > 1 (a stochastically larger distribution) where F(0) = 0, we lose the property of independence. In this simple case signed sequential ranks and sequential ranks are equivalent and taking N = 3 with m = 1

we have

$$P(Y_1 = 1) = 1, \qquad P(Y_2 = 1) = \frac{1}{1+a}, \qquad P(Y_3 = 1) = \frac{1+3a}{2(1+a)(1+2a)}$$

and $P(Y_1 = 1, Y_2 = 1, Y_3 = 1) = \frac{1}{2(1+2a)}$. In general, for a > 1

$$\frac{1}{2(1+2a)} \ne \frac{1}{1+a} \cdot \frac{1+3a}{2(1+a)(1+2a)} = \frac{1}{2(1+2a)} \cdot \frac{1+3a}{(1+a)^2}$$
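The N = 3, m = 1 example can be checked numerically. The following is a minimal sketch, not part of the original derivation: it evaluates the closed forms with exact rational arithmetic and, under the added assumption that F is uniform on (0, 1) so that $F^a$ can be sampled by inverse transform, estimates the joint probability by simulation. The function names are illustrative.

```python
from fractions import Fraction
import random

def exact_probs(a):
    """Closed forms quoted above for N = 3, m = 1 and alternative F^a."""
    a = Fraction(a)
    p2 = 1 / (1 + a)                                # P(Y_2 = 1)
    p3 = (1 + 3 * a) / (2 * (1 + a) * (1 + 2 * a))  # P(Y_3 = 1)
    joint = 1 / (2 * (1 + 2 * a))                   # P(Y_1 = Y_2 = Y_3 = 1)
    return p2, p3, joint

def simulate_joint(a, trials=100_000, seed=0):
    """Monte Carlo estimate of P(X_3 < X_2 < X_1), assuming F uniform on (0, 1)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        x1 = rng.random()                 # X_1 ~ F
        x2 = rng.random() ** (1.0 / a)    # X_2 ~ F^a by inverse transform
        x3 = rng.random() ** (1.0 / a)    # X_3 ~ F^a
        hits += x3 < x2 < x1
    return hits / trials

p2, p3, joint = exact_probs(2)
print(p2 * p3, joint)   # product of marginals vs. joint probability
```

For a = 2 the product of the marginals and the joint probability differ, which is the dependence the text describes.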

Since there are cases when the signed sequential ranks are independent we now determine the marginal distributions for signed sequential ranks in the case where a change in distribution from F(x) to G(x) occurs for arbitrary continuous distributions F(x) and G(x). Let $X_1, X_2, \ldots, X_N$ be independent random variables with $X_i$, $1 \le i \le m$, distributed as F(x) and $X_i$, $m + 1 \le i \le N$, distributed as G(x). Take N = m + n, and let $Y_i$ be the signed sequential rank of $X_i$ and $H_k(t)$ be the distribution function of the k-th order statistic from the set $\{|X_1|, |X_2|, \ldots, |X_{N-1}|\}$. It is enough to determine the distribution of $Y_N$. Using Lemma 2.1 and $P(|X_1| \le x) = F(x) - F(-x)$ for $x \ge 0$ we get

$$H_k(t) = \sum_{i=k}^{N-1} \sum_{j=0}^{i} \binom{m}{j}\binom{n-1}{i-j} (F(t) - F(-t))^{j} (1 - F(t) + F(-t))^{m-j} (G(t) - G(-t))^{i-j} (1 - G(t) + G(-t))^{n-i+j-1}, \quad t \ge 0 \tag{5.5}$$


Now let $Z_k$ be the k-th order statistic from $\{|X_1|, |X_2|, \ldots, |X_{N-1}|\}$. Then

$$P(Y_N = 1) = P(0 < X_N < Z_1) = E(G(Z_1)) - G(0) = \int_0^\infty G(t)\,dH_1(t) - G(0) = 1 - G(0) - \int_0^\infty H_1(t)\,dG(t)$$


Also for $2 \le k \le N - 1$

$$P(Y_N = k) = P(Z_{k-1} < X_N < Z_k) = E(G(Z_k)) - E(G(Z_{k-1})),$$

$$E(G(Z_k)) = \int_0^\infty G(t)\,dH_k(t) = 1 - \int_0^\infty H_k(t)\,dG(t)$$

and

$$P(Y_N = k) = \int_0^\infty \{H_{k-1}(t) - H_k(t)\}\,dG(t) = \sum_{j=0}^{k-1} \binom{m}{j}\binom{n-1}{k-1-j} \int_0^\infty (F(t) - F(-t))^{j} (1 - F(t) + F(-t))^{m-j} (G(t) - G(-t))^{k-1-j} (1 - G(t) + G(-t))^{n-k+j}\,dG(t)$$

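For k = N we get $P(Y_N = N) = P(Z_{N-1} < X_N) = 1 - E(G(Z_{N-1})) = \int_0^\infty H_{N-1}(t)\,dG(t)$. For negative values of $Y_N$ we can calculate $P(Y_N = -k) = P(Z_{k-1} < -X_N < Z_k)$ in a similar manner to obtain finally, for $2 \le k \le N - 1$,

$$\begin{aligned}
P(Y_N = 1) &= 1 - G(0) - \int_0^\infty H_1(t)\,dG(t) \\
P(Y_N = -1) &= G(0) + \int_0^\infty H_1(t)\,dG(-t) \\
P(Y_N = k) &= \sum_{j=0}^{k-1} \binom{m}{j}\binom{n-1}{k-1-j} \int_0^\infty (F(t) - F(-t))^{j} (1 - F(t) + F(-t))^{m-j} (G(t) - G(-t))^{k-1-j} (1 - G(t) + G(-t))^{n-k+j}\,dG(t) \\
P(Y_N = -k) &= -\sum_{j=0}^{k-1} \binom{m}{j}\binom{n-1}{k-1-j} \int_0^\infty (F(t) - F(-t))^{j} (1 - F(t) + F(-t))^{m-j} (G(t) - G(-t))^{k-1-j} (1 - G(t) + G(-t))^{n-k+j}\,dG(-t) \\
P(Y_N = N) &= \int_0^\infty H_{N-1}(t)\,dG(t) \\
P(Y_N = -N) &= -\int_0^\infty H_{N-1}(t)\,dG(-t)
\end{aligned} \tag{5.6}$$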


The equations given in (5.6) can be written in one formula as

$$P(Y_N = \epsilon k) = \epsilon \sum_{j=0}^{k-1} \binom{m}{j}\binom{n-1}{k-1-j} \int_0^\infty (F(t) - F(-t))^{j} (1 - F(t) + F(-t))^{m-j} (G(t) - G(-t))^{k-1-j} (1 - G(t) + G(-t))^{n-k+j}\,dG(\epsilon t) \tag{5.7}$$

where $\epsilon = \pm 1$ and $k = 1, 2, \ldots, N$. Verification that (5.7) reduces to (5.6) in the case k = 1 and $\epsilon = \pm 1$ can be accomplished through the following result.


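Lemma 5.1.

$$\sum_{i=0}^{N} \sum_{j=0}^{i} \binom{m}{j}\binom{n}{i-j} p^{j} (1-p)^{m-j} q^{i-j} (1-q)^{n-i+j} = 1 \qquad (N = m + n)$$

Proof: Let $a_{ij} = \binom{m}{j}\binom{n}{i-j} p^{j} (1-p)^{m-j} q^{i-j} (1-q)^{n-i+j}$ and recall the convention $\binom{a}{b} = 0$ if b > a. Instead of summing as indicated we sum along diagonals (writing $\ell = i - j$) and get

$$\begin{aligned}
\sum_{i=0}^{N} \sum_{j=0}^{i} a_{ij} &= \sum_{\ell=0}^{N} \sum_{j=0}^{N-\ell} a_{\ell+j,\,j} \\
&= \sum_{\ell=0}^{N} \sum_{j=0}^{N-\ell} \binom{m}{j}\binom{n}{\ell} p^{j} (1-p)^{m-j} q^{\ell} (1-q)^{n-\ell} \\
&= \sum_{\ell=0}^{N} \binom{n}{\ell} q^{\ell} (1-q)^{n-\ell} \sum_{j=0}^{N-\ell} \binom{m}{j} p^{j} (1-p)^{m-j} \\
&= \sum_{\ell=0}^{n} \binom{n}{\ell} q^{\ell} (1-q)^{n-\ell} \sum_{j=0}^{N-\ell} \binom{m}{j} p^{j} (1-p)^{m-j} \qquad \text{since } \binom{n}{\ell} = 0 \text{ for } \ell > n.
\end{aligned}$$

Since $0 \le \ell \le n$ the upper limit in the second sum satisfies $N \ge N - \ell \ge N - n = m$, implying $m \le N - \ell$ and making the second sum always equal to 1. Using the binomial theorem a second time gives the result.

Letting p = F(t) - F(-t), q = G(t) - G(-t) we can write $H_1(t) = 1 - [1-p]^{m} [1-q]^{n-1}$ to complete the verification.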
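As a quick numerical sanity check of Lemma 5.1, the double sum can be evaluated directly. This is only an illustrative sketch; the function name and the sample values of m, n, p, q are arbitrary choices, not taken from the text.

```python
from math import comb

def lemma_sum(m, n, p, q):
    """Double sum of Lemma 5.1 with N = m + n."""
    N = m + n
    total = 0.0
    for i in range(N + 1):
        for j in range(i + 1):
            # Terms with j > m or i - j > n vanish by the binomial convention.
            if j > m or i - j > n:
                continue
            total += (comb(m, j) * comb(n, i - j)
                      * p**j * (1 - p)**(m - j)
                      * q**(i - j) * (1 - q)**(n - i + j))
    return total

print(lemma_sum(3, 4, 0.3, 0.7))
```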

Using Lemma 5.1 and (5.7) we can compute the characteristic function for $Y_N$ as

$$\varphi(u) = E(e^{iuY_N}) = \int_0^\infty e^{iu} [1 - q(t) + q(t)e^{iu}]^{n-1} [1 - p(t) + p(t)e^{iu}]^{m}\,dG(t) - \int_0^\infty e^{-iu} [1 - q(t) + q(t)e^{-iu}]^{n-1} [1 - p(t) + p(t)e^{-iu}]^{m}\,dG(-t) \tag{5.8}$$

where p(t) = F(t) - F(-t) and q(t) = G(t) - G(-t).

Differentiating (5.8) and setting u = 0 we get

$$E(Y_N) = \int_0^\infty (1 + (n-1)\,q(t) + m\,p(t))\,d\{G(t) + G(-t)\} \tag{5.9}$$

$$E(Y_N^2) = 1 + \frac{(n-1)(2n+5)}{6} + \int_0^\infty (3m\,p(t) + 2m(n-1)\,q(t)\,p(t) + m(m-1)\,p^2(t))\,dq(t) \tag{5.10}$$

The marginal distribution of $Y_N$, equation (5.7), holds for arbitrary continuous distribution functions F and G and thus (5.8), (5.9) and (5.10) are the general expressions for the characteristic function, mean and second moment of the $Y_N$. Thus to generalize (5.2) to arbitrary continuous distributions F we let F = G in (5.9) and (5.10) and we get

$$E(Y_N) = (N-1)\left(\frac{1}{2} - G^2(0) - 2\int_0^\infty G(-t)\,dG(t)\right) + 1 - 2G(0) \tag{5.11}$$

$$E(Y_N^2) = \frac{N^2}{3} + \frac{N}{2} + \frac{1}{6} \tag{5.12}$$

6. An Application of Signed Sequential Ranking to Process Control. As stated in the introduction, in the process control problem we wish to determine a procedure which will determine when a given sequence of random variables changes from being distributed according to F(x) to a different distribution G(x). In particular we will consider the case where F(x) satisfies the condition of Theorem 5.1 and changes to G(x) which also satisfies the condition. Inasmuch as the distribution of the signed sequential ranks depends on the parameter F(0) we will of course require $G(0) \ne F(0)$. The procedure described in this section is still applicable to cases where G does not satisfy the condition in Theorem 5.1 but we do not have exact results in such instances. However, empirical results are presented at the end of this section bearing on the effectiveness of the procedure for special cases.

Let $X_1, X_2, \ldots$ be a sequence of independent random variables (observations on a process) with common distribution function F(x) where for all $x \ge 0$ the condition $F(-x) = F(0)[1 - F(x) + F(-x)]$ holds, and let $Y_1, Y_2, \ldots$ be the corresponding signed sequential ranks. We define the cumulative sums $S_n = Z_1 + Z_2 + \cdots + Z_n$ where $Z_i = Y_i/i$. Since the condition in Theorem 5.1 is satisfied the $Z_i$ are independent and

$$P(Z_n = t) = \begin{cases} \dfrac{1-F(0)}{n}, & t = \dfrac{1}{n}, \dfrac{2}{n}, \ldots, 1 \\[2ex] \dfrac{F(0)}{n}, & t = -\dfrac{1}{n}, -\dfrac{2}{n}, \ldots, -1 \end{cases} \tag{6.1}$$

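Some easy computations yield

$$\begin{aligned}
E(Z_n) &= \frac{1-2F(0)}{2}\left(1 + \frac{1}{n}\right) \\
E(Z_n^2) &= \frac{1}{3} + \frac{1}{2n} + \frac{1}{6n^2} \\
\operatorname{Var}(Z_n) &= \frac{1}{3} + \frac{1}{2n} + \frac{1}{6n^2} - \left(\frac{1-2F(0)}{2}\right)^2 \left(1 + \frac{1}{n}\right)^2 \\
E(S_n) &= \frac{1-2F(0)}{2}\left(n + \sum_{i=1}^{n} \frac{1}{i}\right) \\
\operatorname{Var}(S_n) &= \sum_{i=1}^{n} \operatorname{Var}(Z_i)
\end{aligned} \tag{6.2}$$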


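Although tedious, the distribution $P(S_n = t)$ can be computed exactly. For example

$$P(S_1 = t) = P(Z_1 = t) = \begin{cases} 1 - F(0), & t = 1 \\ F(0), & t = -1 \end{cases}$$

$$P(S_2 = t) = P(S_1 + Z_2 = t) = \begin{cases} \dfrac{(1-F(0))^2}{2}, & t = \dfrac{3}{2},\ 2 \\[1.5ex] \dfrac{(1-F(0))\,F(0)}{2}, & t = \dfrac{1}{2} \\[1.5ex] (1-F(0))\,F(0), & t = 0 \\[1.5ex] \dfrac{(1-F(0))\,F(0)}{2}, & t = -\dfrac{1}{2} \\[1.5ex] \dfrac{(F(0))^2}{2}, & t = -\dfrac{3}{2},\ -2 \end{cases}$$

and in general

$$P(S_n = t) = P(S_{n-1} = t - Z_n) = \sum_x P(S_{n-1} = t - x)\,P(Z_n = x) \tag{6.3}$$

where x ranges over $-1, -\frac{n-1}{n}, \ldots, -\frac{1}{n}, \frac{1}{n}, \ldots, \frac{n-1}{n}, 1$.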
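The recursion (6.3) is a discrete convolution and is easy to carry out with exact rational arithmetic. The following is a minimal sketch under the independence assumption of Theorem 5.1; `z_pmf` and `s_pmf` are hypothetical helper names, and F(0) is expected to be supplied as a `Fraction`.

```python
from fractions import Fraction
from collections import defaultdict

def z_pmf(n, f0):
    """(6.1): Z_n = k/n w.p. (1-F(0))/n and -k/n w.p. F(0)/n, k = 1..n."""
    pmf = {}
    for k in range(1, n + 1):
        pmf[Fraction(k, n)] = (1 - f0) / n
        pmf[Fraction(-k, n)] = f0 / n
    return pmf

def s_pmf(n, f0):
    """Exact pmf of S_n by repeated convolution, as in (6.3)."""
    dist = {Fraction(0): Fraction(1)}
    for i in range(1, n + 1):
        nxt = defaultdict(Fraction)
        for s, ps in dist.items():
            for x, px in z_pmf(i, f0).items():
                nxt[s + x] += ps * px
        dist = dict(nxt)
    return dist

for t, pr in sorted(s_pmf(2, Fraction(1, 2)).items()):
    print(t, pr)
```

The number of support points grows quickly with n, which is the tedium the text refers to.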
The procedure we will propose will stop the process whenever $S_n$ does not lie in some fixed open interval (b, a) where $-\infty < b < 0 < a < \infty$. In order to determine the operating characteristics of such a procedure, such as the average number of observations until the process is stopped, we must compute

$$P(N = n) = P(b < S_i < a,\ i = 1, 2, \ldots, n-1,\ S_n \notin (b, a)) \tag{6.4}$$

N being the smallest integral value for which $S_N$ does not lie in the open interval (b, a). Then $E(N) = \sum_{n=1}^{\infty} n\,P(N = n)$ gives the average number of observations as a function of a, b and F(0). In order to compute the probability of reaching the boundaries b and a for the first time at time n the following procedure may be used. We define $F_1(x) = P(S_1 \le x)$, $F_2(x) = P(S_2 \le x,\ b < S_1 < a)$ and in general

$$F_n(x) = P(S_n \le x,\ b < S_i < a,\ i = 1, 2, \ldots, n-1) \tag{6.5}$$

It follows that $F_2(x) = P(Z_2 \le x - S_1,\ b < S_1 < a) = \int_b^a F_{Z_2}(x - y)\,dF_1(y)$ and in general

$$F_n(x) = \int_b^a F_{Z_n}(x - y)\,dF_{n-1}(y) \tag{6.6}$$

The probability of reaching boundary a for the first time at n is $F_n(\infty) - F_n(a)$ and the probability of reaching boundary b for the first time at n is $F_n(b) - F_n(-\infty)$. Using these probabilities we can also calculate E(N).
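Because the $Z_n$ are discrete, the recursion (6.6) amounts to propagating the probability mass of paths that are still inside (b, a) and recording the mass that leaves at each step. The sketch below is one way to organize that computation, not the author's program; the function name is illustrative and a, b, F(0) are assumed to be given as `Fraction` values.

```python
from fractions import Fraction
from collections import defaultdict

def stopping_pmf(a, b, f0, n_max):
    """P(N = n), n = 1..n_max, for the rule 'stop when S_n leaves (b, a)'."""
    alive = {Fraction(0): Fraction(1)}   # mass of paths still inside (b, a)
    out = []
    for n in range(1, n_max + 1):
        nxt, stopped = defaultdict(Fraction), Fraction(0)
        for s, ps in alive.items():
            for k in range(1, n + 1):
                for x, px in ((Fraction(k, n), (1 - f0) / n),
                              (Fraction(-k, n), f0 / n)):
                    t = s + x
                    if b < t < a:
                        nxt[t] += ps * px     # continue sampling
                    else:
                        stopped += ps * px    # first exit at time n
        alive = dict(nxt)
        out.append(stopped)
    return out

pmf = stopping_pmf(Fraction(3, 2), Fraction(-3, 2), Fraction(1, 2), 6)
print(pmf)
```

Summing n times these probabilities over a long enough horizon gives a truncated approximation to E(N).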

Computations of the probability functions in (6.6) could be carried out and the computational burden lessened somewhat by noting that for large values of n the $Z_n$ tend to become identically distributed. We now consider some approximations to E(N) using some results from sequential analysis.

Using (5.1) the characteristic function of $Z_n$ is given by

$$\varphi_n(u) = \frac{1-F(0)}{n} \cdot \frac{e^{iu/n} - e^{iu(1+1/n)}}{1 - e^{iu/n}} + \frac{F(0)}{n} \cdot \frac{e^{-iu/n} - e^{-iu(1+1/n)}}{1 - e^{-iu/n}} \tag{6.7}$$

and using limited expansions of exponentials we have

$$\varphi(u) = \lim_{n \to \infty} \varphi_n(u) = (1-F(0))\,\frac{e^{iu} - 1}{iu} + F(0)\,\frac{1 - e^{-iu}}{iu} \tag{6.8}$$
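The limit (6.8) can be checked numerically by evaluating the characteristic function of $Z_n$ directly from (6.1) for a large n. This is an illustrative sketch only; the function names and the test point u are arbitrary.

```python
import cmath

def phi_n(u, n, f0):
    """Characteristic function of Z_n computed directly from (6.1)."""
    pos = sum(cmath.exp(1j * u * k / n) for k in range(1, n + 1))
    neg = sum(cmath.exp(-1j * u * k / n) for k in range(1, n + 1))
    return ((1 - f0) * pos + f0 * neg) / n

def phi_limit(u, f0):
    """Limit (6.8); valid for u != 0."""
    return ((1 - f0) * (cmath.exp(1j * u) - 1) / (1j * u)
            + f0 * (1 - cmath.exp(-1j * u)) / (1j * u))

print(abs(phi_n(1.5, 2000, 0.3) - phi_limit(1.5, 0.3)))
```

The difference shrinks roughly like 1/n, as expected for a Riemann-sum approximation to the integral over a uniform density.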


which is the characteristic function of a random variable with density

$$f(x) = \begin{cases} F(0), & -1 < x < 0 \\ 1 - F(0), & 0 < x < 1 \end{cases} \tag{6.9}$$

For large values of n, $Z_n$ has approximately the density of (6.9). The moment generating function associated with (6.9) is

$$M(t) = F(0)\,\frac{1 - e^{-t}}{t} + (1 - F(0))\,\frac{e^{t} - 1}{t} \tag{6.10}$$

which exists for all real values of t. As an approximation we will use $E(Z_n) = \frac{1-2F(0)}{2}$. In the cumulative sums $S_n = Z_1 + Z_2 + \cdots + Z_n$ the $Z_i$ are independent and, as noted, not identically distributed. However if we disregard the first few signed sequential ranks and start later in the sequence the approximation to identically distributed random variables improves. As before, we take N to be the smallest integral value for which $S_N$ does not lie in (b, a). We use the results of Wald [5] in the sequel.

Consider first the case where F(0) = 1/2 (F is symmetric about 0). Here $E(Z_n) = 0$ and using (3.8) of [5]

$$E(N) = \frac{E(S_N^2)}{E(Z_n^2)} \approx \frac{-ab}{E(Z_n^2)} = -3ab$$


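When $F(0) \ne 1/2$, $E(Z_n) \ne 0$ and we can use $E(S_N) = E(Z_n) \cdot E(N)$ and the approximation $E(S_N) \approx a\,P(S_N \ge a) + b\,(1 - P(S_N \ge a))$ to get

$$E(N) = \begin{cases} -3ab, & F(0) = 1/2 \\[1.5ex] \dfrac{2b + 2(a-b)\,P(S_N \ge a)}{1 - 2F(0)}, & F(0) \ne 1/2 \end{cases} \tag{6.11}$$

Let h be the nonzero root of M(t) = 1. A further approximation gives

$$P(S_N \ge a) \approx \frac{1 - e^{bh}}{e^{ah} - e^{bh}} \tag{6.12}$$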

where of course h depends on the value of F(0). Setting M(t) = 1 we get

$$F(0) = \frac{1 + t - e^{t}}{2 - e^{t} - e^{-t}}$$

which must be solved for t. Each solution corresponding to a fixed value of F(0) is a value for h in (6.12) yielding, in turn, a solution to (6.11).

Let $g(t) = \dfrac{1 + t - e^{t}}{2 - e^{t} - e^{-t}}$. Then

$$g'(t) = \frac{4(1 - \cosh t) + 2t \sinh t}{4(1 - \cosh t)^2}$$

and considering the numerator $a(t) = 4(1 - \cosh t) + 2t \sinh t$ we find $a'(t) = -2\sinh t + 2t \cosh t$ and moreover

$$a'(t) < 0 \ \ (t < 0), \qquad a'(0) = 0, \qquad a'(t) > 0 \ \ (t > 0).$$

Since a(0) = 0, it follows that a(t) > 0 for $t \ne 0$, making g'(t) > 0, and g(t) is monotone increasing in t. As F(0) increases from 0 to 1 the solution to $F(0) = \dfrac{1 + t - e^{t}}{2 - e^{t} - e^{-t}}$, say h(F(0)), increases from $-\infty$ to $\infty$. Notice that

$$\lim_{t \to \infty} g(t) = \lim_{t \to \infty} \frac{e^{-t} + t e^{-t} - 1}{2e^{-t} - 1 - e^{-2t}} = 1$$

$$\lim_{t \to -\infty} g(t) = \lim_{t \to -\infty} \frac{e^{t} + t e^{t} - e^{2t}}{2e^{t} - e^{2t} - 1} = 0.$$

Now for h = h(F(0)) increasing, $P(S_N \ge a) = \dfrac{1 - e^{bh}}{e^{ah} - e^{bh}}$ is decreasing in h since for

$$g(h) = \frac{1 - e^{bh}}{e^{ah} - e^{bh}}$$

we have

$$g'(h) = \frac{(a-b)\,e^{(a+b)h} - [a e^{ah} - b e^{bh}]}{(e^{ah} - e^{bh})^2}$$

and considering the numerator after factoring out $e^{(a+b)h}$ we have to show

$$a - b - a e^{-bh} + b e^{-ah} \le 0 \quad \text{for all } h.$$

Writing $\alpha = a/(a-b)$, $\beta = -b/(a-b)$ we have $\alpha + \beta = 1$, $\alpha, \beta > 0$ and we must show that $1 \le \alpha e^{\beta(a-b)h} + \beta e^{-\alpha(a-b)h}$. Let $f(h) = \alpha e^{\beta(a-b)h} + \beta e^{-\alpha(a-b)h}$ and notice that f(0) = 1, f'(0) = 0 with f''(h) > 0 since

$$f'(h) = \alpha\beta(a-b)\,e^{\beta(a-b)h} - \alpha\beta(a-b)\,e^{-\alpha(a-b)h}$$

$$f''(h) = \alpha\beta^2(a-b)^2\,e^{\beta(a-b)h} + \alpha^2\beta(a-b)^2\,e^{-\alpha(a-b)h}$$

Thus f(h) attains its minimum value at h = 0. For increasing values of F(0) the corresponding values of h = h(F(0)) increase and $P(S_N \ge a)$ decreases. For $F(0) \ne 1/2$ we have

$$E(N) = \frac{2b + 2(a-b)\,\dfrac{1 - e^{bh}}{e^{ah} - e^{bh}}}{1 - 2F(0)} \tag{6.13}$$

In particular taking b = -a, $h \ne 0$ we have

$$E(N) = \frac{2a\,(1 - e^{-ah} - \sinh(ah))\,(1 - \cosh h)}{\sinh(ah)\,(\sinh h - h)} \tag{6.14}$$

For h = 0, $E(N) = 3a^2$, and (6.14) is plotted in Figure 1 for selected values of a. The function $g(t) = \dfrac{1 + t - e^{t}}{2 - e^{t} - e^{-t}}$ is shown in Figure 2. E(N) is plotted against F(0) in Figure 3.
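Since g is monotone, h can be found for any given F(0) by bisection and then substituted into (6.12) and (6.13). The sketch below is an assumption-laden illustration rather than the method used to draw the figures: the bracket [-50, 50], the small-t series g(t) ≈ 1/2 + t/6, and the function names are all choices made here, and the code is intended for moderate values of a and F(0) so that the exponentials do not overflow.

```python
import math

def g(t):
    """g(t) = (1 + t - e^t) / (2 - e^t - e^-t), with g(0) = 1/2 by continuity."""
    if abs(t) < 1e-6:
        return 0.5 + t / 6.0                     # leading terms of the series
    return (t - math.expm1(t)) / -(math.expm1(t) + math.expm1(-t))

def solve_h(f0, lo=-50.0, hi=50.0):
    """Bisection for g(h) = F(0); g is increasing, so the root is unique."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) < f0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def expected_n(a, b, f0):
    """Approximation (6.11)-(6.13) to E(N) for continuation interval (b, a)."""
    if abs(f0 - 0.5) < 1e-12:
        return -3.0 * a * b
    h = solve_h(f0)
    p_upper = (1 - math.exp(b * h)) / (math.exp(a * h) - math.exp(b * h))
    return (2 * b + 2 * (a - b) * p_upper) / (1 - 2 * f0)

print(solve_h(0.2), expected_n(10, -10, 0.2))
```

With b = -a the result agrees with (6.14), since that formula is (6.13) rewritten in terms of h alone.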

Suppose now that a process is observed according to some measurable characteristic and we have a sequence $X_1, X_2, \ldots$ distributed according to F(x) where F(x) satisfies the condition of Theorem 5.1 and moreover we assume F(0) = 1/2. If we set boundaries (-a, a), a > 0, and use the rule which requires us to stop the process when $S_n \notin (-a, a)$ for the first time we can expect to continue for $3a^2$ observations before stopping. However if the process is such that $F(0) \ne 1/2$ we will stop the process in the reduced average time as given in Figure 1. Similar computations can be made for arbitrary intervals (b, a) using (6.13). However, in the process control problem we wish to detect when a change takes place in the distribution of the basic random variables. We have seen that when a change takes place from F(x) to G(x) at some point in the sequence, the signed sequential ranks are no longer independent in general. Suppose the change is to a distribution G(x) such that the condition of Theorem 5.1 is still satisfied and the change takes place at time m. Intuitively, one might feel that for large values of n the distribution of the (m + n)-th signed sequential rank would depend very little on F and m. This being so we could assume the sequence $\{Z_i\}$ to be independent for the purpose of determining the expected number of observations until the process is stopped. For example suppose we take (b, a) as the continuation interval and denote (6.13) by E(a, b, F(0)). Given that $S_m = x$, b < x < a, the expected number of additional observations under G(x) is

$$E(N \mid S_m = x) = E(a - x,\ b - x,\ G(0)) = \frac{2(b - x)}{1 - 2G(0)} + \frac{2(a - b)}{1 - 2G(0)} \cdot \frac{e^{xh} - e^{bh}}{e^{ah} - e^{bh}} \tag{6.15}$$

[Figure 2: plot of g(t)]

The conditional distribution for $S_m$ is

$$P(S_m \le x \mid b < S_m < a) = \frac{P(b < S_m \le x)}{P(b < S_m < a)} = \begin{cases} 1, & x \ge a \\[1.5ex] \dfrac{F_{S_m}(x) - F_{S_m}(b)}{F_{S_m}(a) - F_{S_m}(b)}, & b < x < a \\[1.5ex] 0, & x \le b \end{cases} \tag{6.16}$$

and the unconditional total expected number of observations is given by

$$E(N, m, G(0)) = m + \frac{1}{F_{S_m}(a) - F_{S_m}(b)} \int_b^a E(N \mid S_m = x)\,dF_{S_m}(x) \tag{6.17}$$

To lend some support to the statement that for large values of n the distribution of $Z_{m+n}$ does not depend too much on m and the distribution of $X_1, X_2, \ldots, X_m$ (and thus could be taken as G(x) to justify (6.15)) we examine its characteristic function as $n \to \infty$. We have

$$\begin{aligned}
\lim_{n \to \infty} \varphi_{m+n}(u) &= \lim_{n \to \infty} E\left(e^{iuZ_{m+n}}\right) \\
&= \lim_{n \to \infty} \int_0^\infty e^{\frac{iu}{m+n}} \left(1 - q(t) + q(t)\,e^{\frac{iu}{m+n}}\right)^{n-1} \left(1 - p(t) + p(t)\,e^{\frac{iu}{m+n}}\right)^{m} dG(t) \\
&\quad - \lim_{n \to \infty} \int_0^\infty e^{-\frac{iu}{m+n}} \left(1 - q(t) + q(t)\,e^{-\frac{iu}{m+n}}\right)^{n-1} \left(1 - p(t) + p(t)\,e^{-\frac{iu}{m+n}}\right)^{m} dG(-t) \\
&= \int_0^\infty \lim_{n \to \infty} \left(1 - q(t) + q(t)\,e^{\frac{iu}{m+n}}\right)^{n-1} dG(t) - \int_0^\infty \lim_{n \to \infty} \left(1 - q(t) + q(t)\,e^{-\frac{iu}{m+n}}\right)^{n-1} dG(-t) \\
&= \int_0^\infty e^{iq(t)u}\,dG(t) - \int_0^\infty e^{-iq(t)u}\,dG(-t)
\end{aligned}$$

Also, since $q(t) = \dfrac{G(t)}{1 - G(0)} - \dfrac{G(0)}{1 - G(0)}$ and $-q(t) = \dfrac{G(-t)}{G(0)} - 1$ we have

$$\lim_{n \to \infty} \varphi_{m+n}(u) = G(0)\,\frac{1 - e^{-iu}}{iu} + (1 - G(0))\,\frac{e^{iu} - 1}{iu}$$

corresponding to (6.10).

We now consider a case where we have a change from a distribution satisfying the condition in Theorem 5.1 to another such distribution. Imagine a production process where some dimension is measured on the items being produced. Let these measurements be $X_1, X_2, \ldots$, assumed to be independent and identically distributed as F(x). Each item is subject to inspection and if $X_i \le 0$ the item is removed from the production line with probability p. The result is a new sequence, say $C_1, C_2, \ldots$, and we call this random censoring and $\{C_i\}$ the censored sequence. The distribution of $C_i$ can be found by

$$\begin{aligned}
P(C_i \le t) &= \sum_{j=1}^{\infty} P(C_i \le t \mid C_i = X_{i+j-1})\,P(X_{i+j-1} = C_i) \\
&= \sum_{j=1}^{\infty} P(X_{i+j-1} \le t \mid C_i = X_{i+j-1})\,P(X_{i+j-1} = C_i) \\
&= P(X_1 \le t \mid X_1 = C_1) \sum_{j=1}^{\infty} P(C_i = X_{i+j-1}) \\
&= P(X_1 \le t \mid X_1 = C_1)
\end{aligned}$$

For $t \le 0$

$$\begin{aligned}
P(C_i \le t) &= P(X_1 \le t \mid X_1 = C_1) \\
&= \frac{P(X_1 \le t,\ X_1 \le 0,\ X_1 \text{ not censored})}{P(C_1 = X_1)} \\
&= \frac{P(X_1 \le t,\ X_1 \text{ not censored})}{1 - pF(0)} \\
&= \frac{(1 - p)\,F(t)}{1 - pF(0)}.
\end{aligned}$$

For t > 0

$$P(C_i \le t) = \frac{P(X_1 \le t,\ X_1 \le 0,\ X_1 \text{ not censored}) + P(X_1 \le t,\ 0 < X_1)}{1 - pF(0)} = \frac{F(t) - pF(0)}{1 - pF(0)}$$

For random censoring when $X_i \le 0$ we have

$$P(C_i \le t) = \begin{cases} \dfrac{(1 - p)\,F(t)}{1 - pF(0)}, & t \le 0 \\[1.5ex] \dfrac{F(t) - pF(0)}{1 - pF(0)}, & t > 0 \end{cases} \tag{6.9}$$

In a similar way if we censor with probability p when $X_i > 0$ we have

$$P(C_i \le t) = \begin{cases} \dfrac{F(t)}{1 - p + pF(0)}, & t \le 0 \\[1.5ex] \dfrac{(1 - p)\,F(t) + pF(0)}{1 - p + pF(0)}, & t > 0 \end{cases} \tag{6.10}$$

Suppose now that the symmetry condition $F(-t) = F(0)[1 - F(t) + F(-t)]$ holds for all $t \ge 0$. In the case of random censoring for $X_i \le 0$ we have for $t \ge 0$

$$\begin{aligned}
G(t) &= P(C_i \le t) \\
F(t) &= [1 - pF(0)]\,G(t) + pF(0) \\
F(-t) &= \frac{1 - pF(0)}{1 - p}\,G(-t) \\
F(0) &= \frac{1 - pF(0)}{1 - p}\,G(0)
\end{aligned}$$

Using these relations it follows that

$$G(0)[1 - G(t) + G(-t)] = \frac{(1 - p)\,F(0)}{1 - pF(0)} \cdot \frac{1 - F(t) + (1 - p)\,F(-t)}{1 - pF(0)}$$

and from the symmetry condition on F we get

$$G(0)[1 - G(t) + G(-t)] = G(-t) \quad \text{for all } t \ge 0,$$

with a change from F(0) to $G(0) = \dfrac{(1 - p)\,F(0)}{1 - pF(0)}$. In particular for F(0) = 1/2, G(0) = (1 - p)/(2 - p). A similar calculation for censoring when $X_i > 0$ shows that the symmetry condition holds for G(t) and G(0) = F(0)/(1 - p + pF(0)). For F(0) = 1/2, G(0) = 1/(2 - p).
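The preservation of the symmetry condition under censoring can be verified numerically for a particular F. In the sketch below the standard normal distribution is used purely as an example of a symmetric F (an assumption made here, not a case singled out in the text), and the function names are illustrative.

```python
import math

def norm_cdf(x):
    """Standard normal cdf, used here only as an example of a symmetric F."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def censored_cdf(t, p, F=norm_cdf):
    """cdf of C_i when observations with X_i <= 0 are removed with probability p."""
    denom = 1.0 - p * F(0.0)
    if t <= 0:
        return (1.0 - p) * F(t) / denom
    return (F(t) - p * F(0.0)) / denom

def symmetry_gap(t, p, F=norm_cdf):
    """G(-t) - G(0)[1 - G(t) + G(-t)]; zero if the Theorem 5.1 condition holds."""
    G = lambda x: censored_cdf(x, p, F)
    return G(-t) - G(0.0) * (1.0 - G(t) + G(-t))

print(censored_cdf(0.0, 0.5), symmetry_gap(1.0, 0.5))
```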

We shall now compare the expected number of observations needed to stop a process subject to random sampling using a Shewhart type control chart with the expected number needed using the procedure described above. Consider a sequence of independent observations $X_1, X_2, \ldots$ with common continuous symmetric distribution F(x). Subjecting the $X_i$ to random censoring when $X_i \le 0$ we get from (6.9) the distribution of the censored observations $C_1, C_2, \ldots$ as

$$P(C_i \le t) = \begin{cases} \dfrac{2(1 - p)\,F(t)}{2 - p}, & t \le 0 \\[1.5ex] \dfrac{2F(t) - p}{2 - p}, & t > 0 \end{cases} \tag{6.11}$$

We assume here that when p = 0 the process is in control and that when the process starts some fixed value of p, $0 \le p \le 1$, is in effect. If p > 0 we want to stop the process as quickly as possible. We consider three procedures:

procedure 1 - when $C_i > b > 0$ for the first time, stop the process
procedure 2 - when $|C_i| > b > 0$ for the first time, stop the process
procedure 3 - when $|S_n| > a > 0$ for the first time, stop the process

Procedures 1 and 2 are Shewhart type procedures and b is usually taken so that the probability of stopping at a particular stage is small when p = 0. Procedure 3 is the signed sequential rank procedure previously described in this section. Define $p_1 = P(C_i > b)$ and $p_2 = P(|C_i| > b)$ assuming p = 0. For each procedure the probability of falling outside the control limit for the first time at the n-th observation is

$$p_i (1 - p_i)^{n-1}, \quad i = 1, 2$$

and $E_1(N) = 1/p_1$, $E_2(N) = 1/p_2$ are the expected number of observations taken before stopping. $E_3(N) = 3a^2$ and setting $E_1(N) = E_2(N) = E_3(N)$ we get

$$p_1 = 1 - F(b) = F(-b) = 1/3a^2$$

$$p_2 = 1 - F(b) + F(-b) = 2F(-b) = 1/3a^2.$$

For p > 0

$$p_1' = P(C_i > b) = 1 - P(C_i \le b) = \frac{2(1 - F(b))}{2 - p} = \frac{2}{(2 - p)\,3a^2}$$

and

$$p_2' = P(C_i > b) + P(C_i < -b) = \frac{2}{2 - p}\,F(-b) + \frac{2(1 - p)}{2 - p}\,F(-b) = 2F(-b) = 1/3a^2$$

Thus for p > 0

$$E_1(N) = 1/p_1' = \frac{3a^2 (2 - p)}{2}$$

$$E_2(N) = 1/p_2' = 3a^2$$

$$E_3(N) = \frac{-2a + 4a\,\dfrac{e^{ah} - 1}{e^{2ah} - 1}}{p/(2 - p)}$$

and notice that since $P(C_i \le 0) = \frac{1 - p}{2 - p} < 1/2$, it follows that h < 0. $E_1(N)$ and $E_2(N)$ increase quadratically with a and $E_3(N)$ is essentially linear in a. For example

                     a = 10                            a = 20
  p           1     3/4     1/2     1/4        1     3/4     1/2     1/4
  E_1(N)    150   187.5    225    262.5      600     750     900    1050
  E_3(N)     20    33.3     60     140        40    66.6     120     280

and procedure 2 is insensitive to values of p > 0. The values of h corresponding to p = 1, 3/4, 1/2, 1/4 are $-\infty$, -2.2, -.9, -.5 respectively.

The following tabulated results were obtained empirically to determine the effect of translation of the mean of the observations. We considered normal observations with mean $\mu$ and variance 1 and stopped sampling when $|S_n| > a$ for the first time where

$$S_n = \sum_{i=1}^{n} Z_i = \sum_{i=1}^{n} \frac{Y_i}{i}$$

and $Y_i$ is the signed sequential rank of $X_i$, $X_i \sim N(\mu, 1)$. For each parameter pair $(a, \mu)$ twenty trials were performed except for $\mu$ = .1, .2, .3 where fifty trials were used. Sample averages, sample variances and sample standard deviations for termination time N are given.

                    a = 10                              a = 20
   μ       mean N       s²          s        mean N       s²          s
   .1      180.78    13449.27    115.97      364.56    31279.43    176.85
   .2      101.78     3306.46     57.50      179.04     3345.18     57.83
   .3       69.95      710.12     26.64      130.40     1437.18     37.91
   .4       52.55      324.99     18.02      108.65     1160.87     34.07
   .5       42.25      139.14     11.79       77.25      367.14     19.16
   .6       39.60      121.41     11.01       72.70      171.69     13.10
   .7       36.55      128.05     11.31       67.45      130.26     11.41
   .8       28.70       31.48      5.61       62.05      115.31     10.73
   .9       28.00       38.31      6.18       53.55       67.31      8.20
  1.0       28.80       29.64      5.44       52.25       39.77      6.30
  1.5       23.40        7.93      2.81       44.00       26.94      5.19
  2.0       22.40        4.98      2.23       42.45       12.05      3.47
  2.5       20.90        5.25      2.29       41.55        9.20      3.03
  3.0       21.65        7.60      2.75       40.95       13.83      3.72
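An experiment of the kind tabulated above can be sketched as follows. This is not the author's original program: the random number generator, the seed, the truncation limit `max_n`, and the function names are assumptions made for illustration, so the numbers it produces need not match the table.

```python
import random
from bisect import bisect_left, insort

def stopping_time(mu, a, rng, max_n=100_000):
    """First n with |S_n| > a, S_n = sum of Y_i / i, for X_i ~ N(mu, 1)."""
    seen, s = [], 0.0
    for n in range(1, max_n + 1):
        x = rng.gauss(mu, 1.0)
        r = bisect_left(seen, abs(x)) + 1      # rank of |x| among |X_1..X_n|
        insort(seen, abs(x))
        s += (r if x > 0 else -r) / n
        if abs(s) > a:
            return n
    return max_n                               # truncated run (assumed rare)

def summarize(mu, a, trials=20, seed=0):
    """Sample mean and sample standard deviation of N over independent trials."""
    rng = random.Random(seed)
    ns = [stopping_time(mu, a, rng) for _ in range(trials)]
    mean = sum(ns) / trials
    var = sum((n - mean) ** 2 for n in ns) / (trials - 1)
    return mean, var ** 0.5

print(summarize(3.0, 10))
```

Small shifts in the mean call for more trials, as in the table, because the stopping time is then much more variable.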
7. Summary and Conclusions. We remarked in the introduction on the paucity of nonparametric sequential procedures, particularly those based on ranks of observations. The author feels that the absence of a natural way of assigning ranks to observations, as the observations are taken, without reranking, was a significant cause for the lack of such procedures. The sequential ranking schemes defined and studied in this dissertation provide us with methods whereby ranks may be assigned in just such a manner.

In order to use the methods of sequential parametric hypothesis testing (Wald's sequential probability ratio test) in our nonparametric setting, we must replace the sequence of observations $X_1, X_2, \ldots$ by a sequence of ranks $R_1, R_2, \ldots$ and base the test on the probability ratio of the ranks. This can be done by the sequential ranking scheme defined in Section 3. One basic nonparametric problem is the two-sample problem where we must decide whether or not an X-population and a Y-population have the same probability distribution. This problem was treated in Section 4 in the special case where the alternatives are of the form proposed by Lehmann [1]. However the method proposed in Section 4 is general in the sense that in order to carry out the test one must only be able to compute $P(U_1 < U_2 < \cdots < U_N)$ where the U's are X's and Y's. In general this computation is difficult, but for special alternatives where the computation is feasible, the method in Section 4 applies directly.

Notice that in the finite sample size problem nothing is sacrificed by ranking sequentially (Theorem 3.1) instead of using ordinary ranks. In fact a little is gained inasmuch as the sequential ranks may be viewed as a transformation of the dependent ordinary ranks into the independent sequential ranks.

Merely ranking observations tells us nothing of their location, except relative to each other. In order to take into account the location of each observation relative to the origin as well as its size (absolute value) and relative location, the method of signed sequential ranking was devised. Contrary to sequential ranks, signed sequential ranks obtained from independent identically distributed observations are not independent in general. A sufficient condition on the distribution of the observations is given in Theorem 5.1 to ensure that the signed sequential ranks will be independent. In the process control problem we used signed sequential ranks of observations whose distributions satisfied this condition. This simplified the calculations since sums of independent random variables were involved in the analysis.

The methods of sequential ranking and signed sequential ranking proposed in this dissertation are new, as far as the author can determine, and provide a natural way of assigning ranks to observations which fits into the theory of sequential analysis (hypothesis testing) and sequential procedures (process control). All the attendant distribution theory results are new and the condition of Theorem 5.1 which ensures the independence of signed sequential ranks is the only one known to the author.

There are many areas for further investigation suggested by this research. In the sequential probability ratio test of Section 4 we did not use the sequential ranks explicitly (except for $Z_n$ in equation (4.2)) in the definition of the probability ratio $S_n$. $S_n$ can be written in terms of $(Z_1, Z_2, \ldots, Z_n)$, the sequential ranks, but the expression is quite complicated and it is much more convenient to use (4.1) and (4.2) which incorporate the most recent sequential rank only. Thus the behavior of $S_n$ was obtained by reference to $A_1, A_2, \ldots, A_n$. More general results are needed as to the probability of termination of $P_1(Z_n)/P_0(Z_n)$ for alternatives other than Lehmann alternatives. This is necessary because under the alternative hypothesis the sequential ranks are not independent generally and the conservative approximations $A = (1-\beta)/\alpha$, $B = \beta/(1-\alpha)$ remain valid for successive dependent observations when the probability is one that the procedure will ultimately terminate.

A second area for further study is the evaluation of the rule given in Section 6 for process control problems when changes from F to G are not of the form presented (e.g. $G(x) = F(x + \Delta)$, $\Delta > 0$). Also there are other ad hoc rules which could be proposed using signed sequential ranks (or sequential ranks) in process control problems.